1.  INTRODUCTION

Consider an organism that displays observable stages (0, 1, ..., A) throughout its lifetime. Two examples are: (1) a holometabolous insect displays egg, larva, pupa and adult stages (see Ross, 1965) and (2) an annual herbaceous angiosperm displays seed, seedling, vegetative, flowering and senescent stages (see Wilson, Loomis and Steeves, 1971). Let $(t_i)_{i=0}^{F}$ be an increasing sequence of fixed sample points in time (sample times) such that a cohort of organisms (i) begins in stage 0 at $t_0$ and (ii) ends in stage A at $t_F$. At each $t_i$ a sample of $n_i$ ($n_i = 1, 2, \ldots$) organisms is selected and the stage of each organism is determined and recorded. Assume that failures (deaths) do not occur and that the samples are selected from an infinitely large population.

The above experiment is applicable in two cases. In Case I, each organism is observed at each $t_i$. Hence, $n_i = N$ where N is the total number of organisms in the cohort. In Case II, the stage of the organism can only be determined by sacrifice, i.e., by destruction of the organism or of the habitat that is necessary for the organism's survival. For example, (1) an adult female insect is sacrificed to determine its ovarian stage and (2) a host (habitat) is destroyed to determine the developmental stage of a parasite (see Ross, 1965). Thus an independent subset of $n_i$ organisms of the cohort is observed at each $t_i$.

Data resulting from such an experiment are tabulated as in Table 1, where $n_{i,s}$ is the number of organisms observed in stage s at $t_i$. Note that $n_i = \sum_{s=0}^{A} n_{i,s}$. Conditions (i) and (ii) on $(t_i)_{i=0}^{F}$ imply $n_0 = n_{0,0}$ ($n_{0,s} = 0$, $0 < s \le A$) and $n_F = n_{F,A}$ ($n_{F,s} = 0$, $0 \le s < A$). Hence, the estimators to be discussed are appropriate only when all organisms in the cohort begin in stage 0 and end in stage A.

Table 1.  Format of stage frequency data.

sample                        stage                             total
time        0         1         ...   s         ...   A         no.

t_0 = 0     n_{0,0}   0         ...   0         ...   0         n_0
t_1         n_{1,0}   n_{1,1}   ...   n_{1,s}   ...   n_{1,A}   n_1
t_2         n_{2,0}   n_{2,1}   ...   n_{2,s}   ...   n_{2,A}   n_2
...
t_i         n_{i,0}   n_{i,1}   ...

Given data from such an experiment, a clear interest is to provide estimates of the parameters (mean, variance, ...) of the distribution of the time to a particular stage s given the time in any other stage $s' < s$ ($0 \le s' < A$). Previous research on stage frequency data centered on estimation of survival or mortality rates (Bellows, Ortiz, Owens and Huddleston, 1982; Birley, 1977; Kiritani and Nakasuji, 1967; Manly, 1974). Estimation of stage recruitment was researched by Kobayashi (1968). Estimators of mean stage duration have been proposed by Boyer and Deaton (1984), Manly (1976, 1977) and Mills (1981). Mills (1981) estimated mean stage duration time based on the arithmetic means of recruitment and stage frequencies, scaled by a "shift of mean" factor to account for stage mortality. Manly (1976, 1977) estimated mean stage duration time by applying a trapezoid approximation to the observed stage "frequency estimates." Variance of mean stage duration time was estimated using linear regression sequentially on three frequency estimates (analogous to a moving average). Manly (1985) extended his methodology to data with left or right censoring of sample times. Boyer and Deaton (1984) estimated mean time to stage s in Case II by first constructing a survival cumulative distribution function (cdf) based on the proportion of organisms not yet attaining stage s and then applying Riemann sums to estimate the mean time to stage s. This report is an extension of Boyer and Deaton (1984), so we review their approach next.

Let the random variable $T_s \in [0, \infty[$ be the time to stage s for an organism. Let $T_s$ have the cdf $F_s$. The survival function for stage s is $P(T_s > t) = G_s(t) = 1 - F_s(t)$. For each $t_i$, $0 \le i \le F$, let $p_{i,s} = P(T_s > t_i) = G_s(t_i)$. $G_s(t_i)$ is the probability that an organism reaches stage s after time $t_i$ and, equivalently, $G_s(t_i)$ is the probability that an organism will not be in stage s by $t_i$. Define an estimator of $p_{i,s}$ by

$$\hat{p}_{i,s} = \frac{1}{n_i} \sum_{j=0}^{s-1} n_{i,j} \tag{1.1}$$

(i.e., the proportion of organisms in the sample not yet in stage s by time $t_i$). The quantity $\sum_{j=0}^{s-1} n_{i,j}$ has a binomial($p_{i,s}$, $n_i$) distribution and $\hat{p}_{i,s}$ is the "usual" unbiased estimator of $p_{i,s}$ (see, for example, Mood, Graybill and Boes, 1974). The $\hat{p}_{i,s}$ would be uniformly minimum variance unbiased if $\hat{p}_{i,s}$ contained information censored from above and below $t_i$.
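The estimator (1.1) can be illustrated with a short Python sketch. The function name and the list-of-rows layout are assumptions made for this illustration only; the counts are those listed later in Table 2 (sample times 4, 7, 9, 11, 13 and 15 days, stages 0 to 2).

```python
def proportion_not_yet_in_stage(counts, s):
    """Return the estimate p_hat_{i,s} for every sample row.

    counts[i][j] is n_{i,j}, the number of organisms seen in stage j
    at sample time t_i (hypothetical layout chosen for this sketch).
    """
    result = []
    for row in counts:
        n_i = sum(row)                      # sample size at t_i
        result.append(sum(row[:s]) / n_i)   # (1/n_i) * sum_{j<s} n_{i,j}
    return result


# Stage counts (egg-larva, pupa, adult) at days 4, 7, 9, 11, 13, 15.
counts = [[8, 0, 0], [15, 0, 0], [3, 15, 0], [0, 14, 0], [0, 2, 12], [0, 0, 2]]
p_hat_1 = proportion_not_yet_in_stage(counts, 1)  # not yet pupa
p_hat_2 = proportion_not_yet_in_stage(counts, 2)  # not yet adult
```

Because row sums are recomputed from the counts, the totals column of the table does not need to be stored.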

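Since $E(T_s) = \int_0^\infty G_s(t)\,dt$ and $G_s(t)$ is monotonic nonincreasing, we can estimate $E(T_s)$ by the Riemann sums $U = \sum_{i=0}^{F-1} G_s(t_i)(t_{i+1} - t_i)$ and $L = \sum_{i=0}^{F-1} G_s(t_{i+1})(t_{i+1} - t_i)$, where U (L) is an upper (lower) bound and $(t_i)_{i=0}^{F}$ is the partition. Averaging U and L results in an approximate expression for the mean time to stage s

$$E(T_s) \approx \frac{1}{2} \sum_{i=0}^{F-1} \left(G_s(t_i) + G_s(t_{i+1})\right)(t_{i+1} - t_i). \tag{1.2}$$

Substituting $\hat{p}_{i,s}$ for $G_s(t_i)$ and $\hat{p}_{i+1,s}$ for $G_s(t_{i+1})$ and noting that $t_0 = 0$, $\hat{p}_{0,s} = 1$ and $\hat{p}_{F,s} = 0$, an estimator of $E(T_s)$ is

$$\hat{E}(T_s) = \frac{1}{2}\left[t_1 + \sum_{i=1}^{F-1} \hat{p}_{i,s}(t_{i+1} - t_{i-1})\right]. \tag{1.3}$$

Thus $E(T_s)$ is approximated as though $G_s(t)$ is a trapezoid in each interval $[t_i, t_{i+1}]$ (Figure 1). To estimate the mean time between stage s and some stage s', take the differences of $\hat{p}_{i,s}$ and $\hat{p}_{i,s'}$ as

$$\hat{E}(T_s - T_{s'}) = \frac{1}{2} \sum_{i=1}^{F-1} (\hat{p}_{i,s} - \hat{p}_{i,s'})(t_{i+1} - t_{i-1}). \tag{1.4}$$

If $s' = s - 1$ then $\hat{E}(T_s - T_{s'})$ is the mean duration time for stage s'.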

The $\hat{p}_{i,s}$ are independent for different $t_i$ (assuming that the population size is infinitely large), so an estimator of the variance of $\hat{E}(T_s)$ is

$$\widehat{Var}(\hat{E}(T_s)) = \frac{1}{4} \sum_{i=1}^{F-1} \frac{1}{n_i}\, \hat{p}_{i,s}(1 - \hat{p}_{i,s})(t_{i+1} - t_{i-1})^2 \tag{1.5}$$

because $Var(\hat{p}_{i,s}) = \frac{1}{n_i}[p_{i,s}(1 - p_{i,s})]$. Similarly, an estimator of the variance of $\hat{E}(T_s - T_{s'})$ is

$$\widehat{Var}(\hat{E}(T_s - T_{s'})) = \frac{1}{4} \sum_{i=1}^{F-1} \frac{1}{n_i}\left[(\hat{p}_{i,s} - \hat{p}_{i,s'})(1 - (\hat{p}_{i,s} - \hat{p}_{i,s'}))\right](t_{i+1} - t_{i-1})^2. \tag{1.6}$$

Boyer and Deaton also proved that the absolute bias of $\hat{E}(T_s)$ is at most $\frac{1}{2}\max(t_{i+1} - t_i)$ and so $MSE(\hat{E}(T_s)) \le \frac{1}{4}(\max(t_{i+1} - t_i))^2 + Var(\hat{E}(T_s))$. Note that $Var(\hat{E}(T_s))$ is decreased by increasing $n_i$ (i.e., taking more samples at each $t_i$), and bias is decreased by increasing the number of $t_i$ in $[t_0, t_F]$ (i.e., refining the partition $(t_i)_{i=0}^{F}$, which is equivalent to sampling more frequently).
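As a rough illustration of the mean and its estimated variance, the following Python sketch evaluates the trapezoid sum (1.2) with the sample proportions substituted (which equals (1.3) when the first proportion is 1 and the last is 0) and the variance estimator (1.5). The function names are hypothetical, and the inputs are the pupal-stage values from the example in Section 3, with a row for $t_0 = 0$ prepended.

```python
import math


def mean_time_to_stage(times, p_hat):
    """Trapezoid estimate: 0.5 * sum (p_i + p_{i+1}) * (t_{i+1} - t_i)."""
    return 0.5 * sum((p_hat[i] + p_hat[i + 1]) * (times[i + 1] - times[i])
                     for i in range(len(times) - 1))


def var_of_mean_time(times, p_hat, n):
    """(1/4) * sum_{i=1}^{F-1} p_i (1 - p_i) / n_i * (t_{i+1} - t_{i-1})^2."""
    return 0.25 * sum(p_hat[i] * (1 - p_hat[i]) / n[i]
                      * (times[i + 1] - times[i - 1]) ** 2
                      for i in range(1, len(times) - 1))


times = [0, 4, 7, 9, 11, 13, 15]
p_hat = [1, 1, 1, 3 / 18, 0, 0, 0]   # proportion not yet pupa
n = [1, 8, 15, 18, 14, 14, 2]        # n[0] is a placeholder; it is never used
mean_1 = mean_time_to_stage(times, p_hat)
se_1 = math.sqrt(var_of_mean_time(times, p_hat, n))
```

With these inputs the mean is 25/3 days and the standard error is about 0.1757, in line with the program output reported in Table 3.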

In this report, we extend the approach of Boyer and Deaton. In Section 2, we propose two estimators for $Var(T_s)$ and prove two relational properties about the estimators. Section 3 contains an entomological example to demonstrate the calculations of the estimates. A computer program to calculate the estimates is in Section 4. A comparison of the estimators with parameter values from five survival distributions using simulation is presented in Section 5.


Figure 1.  Geometric representation of the calculation of $\hat{E}(T_s)$ (vertical axis: G(t); horizontal axis: sample times $t_0$, $t_1$, ..., $t_i$, $t_{i+1}$, ..., $t_F$).


2.  ESTIMATORS OF VAR($T_s$)

We propose two estimators of $Var(T_s)$, $\widehat{Var}_T(T_s)$ and $\widehat{Var}_L(T_s)$, and prove two relational properties. $Var(T_s)$ may be defined as

$$Var(T_s) = 2\int_0^\infty t\,(1 - F_s(t))\,dt - [E(T_s)]^2 \tag{2.1}$$

(see Mood, Graybill and Boes, 1974). The crucial estimation is of $E(T_s^2) = 2\int_0^\infty t\,G_s(t)\,dt$.


2.1  $\widehat{Var}_T(T_s)$: Trapezoid Method

Let

$$E(T_s^2)_{\Delta t} = 2\int_{t_i}^{t_{i+1}} t\,G_s(t)\,dt \tag{2.2}$$

(i.e., the contribution to the second moment of $T_s$ on some $\Delta t = [t_i, t_{i+1}]$). Since $G_s(t)$ is monotonic nonincreasing on $\Delta t$,

$$2G_s(t_i)\int_{t_i}^{t_{i+1}} t\,dt \;\ge\; 2\int_{t_i}^{t_{i+1}} t\,G_s(t)\,dt \;\ge\; 2G_s(t_{i+1})\int_{t_i}^{t_{i+1}} t\,dt,$$

where $G_s(t_i) = \sup\{G_s(t) \mid t_i \le t \le t_{i+1}\}$ and $G_s(t_{i+1}) = \inf\{G_s(t) \mid t_i \le t \le t_{i+1}\}$. So

$$G_s(t_i)(t_{i+1}^2 - t_i^2) \;\ge\; 2\int_{t_i}^{t_{i+1}} t\,G_s(t)\,dt \;\ge\; G_s(t_{i+1})(t_{i+1}^2 - t_i^2).$$

Taking the average of left and right sides we obtain $E_T(T_s^2)_{\Delta t} = \frac{1}{2}(G_s(t_i) + G_s(t_{i+1}))(t_{i+1}^2 - t_i^2)$. So

$$E_T(T_s^2) = \frac{1}{2} \sum_{i=0}^{F-1} (G_s(t_i) + G_s(t_{i+1}))(t_{i+1}^2 - t_i^2) \tag{2.3}$$

and

$$Var_T(T_s) = E_T(T_s^2) - [E(T_s)]^2 \tag{2.4}$$

where $E(T_s)$ is (1.2). Thus an estimator of $Var(T_s)$ is

$$\widehat{Var}_T(T_s) = \frac{1}{2} \sum_{i=0}^{F-1} (\hat{p}_{i,s} + \hat{p}_{i+1,s})(t_{i+1}^2 - t_i^2) - [\hat{E}(T_s)]^2 \tag{2.5}$$

where $\hat{E}(T_s)$ is (1.3). Expanding (2.5) did not result in a simpler form, so we recommend (2.5) in calculating $\widehat{Var}_T(T_s)$.
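A minimal Python sketch of (2.5) follows. The function names are hypothetical and the inputs are the pupal-stage proportions from the example in Section 3 (with $t_0 = 0$ prepended); this is an illustration, not the authors' SAS program.

```python
def mean_time(times, p_hat):
    """Trapezoid estimate of the mean time to the stage."""
    return 0.5 * sum((p_hat[i] + p_hat[i + 1]) * (times[i + 1] - times[i])
                     for i in range(len(times) - 1))


def second_moment_trapezoid(times, p_hat):
    """0.5 * sum (p_i + p_{i+1}) * (t_{i+1}^2 - t_i^2), as in (2.3)/(2.5)."""
    return 0.5 * sum((p_hat[i] + p_hat[i + 1])
                     * (times[i + 1] ** 2 - times[i] ** 2)
                     for i in range(len(times) - 1))


def var_trapezoid(times, p_hat):
    """Trapezoid-method variance: second moment minus squared mean."""
    return second_moment_trapezoid(times, p_hat) - mean_time(times, p_hat) ** 2


times = [0, 4, 7, 9, 11, 13, 15]
p_hat = [1, 1, 1, 3 / 18, 0, 0, 0]
m2_T = second_moment_trapezoid(times, p_hat)
var_T = var_trapezoid(times, p_hat)
```

For these inputs the second moment is 71 and the variance is 14/9, whose square root (about 1.2472) matches the trapezoid standard deviation for the pupal stage in Table 3.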


2.2  $\widehat{Var}_L(T_s)$: Straight Line Method

Define $Var(T_s)$ as (2.1) and $E(T_s^2)_{\Delta t}$ as (2.2). Assume $G_s(t)$ is a linear function on $\Delta t$. Then

$$G_s(t) = k + ct = G_s(t_i) + \frac{G_s(t_i) - G_s(t_{i+1})}{t_i - t_{i+1}}\,(t - t_i),$$

$$E_L(T_s^2)_{\Delta t} = 2\int_{t_i}^{t_{i+1}} t\,G_s(t)\,dt = \left[G_s(t_i) - \frac{G_s(t_i) - G_s(t_{i+1})}{t_i - t_{i+1}}\,t_i\right](t_{i+1}^2 - t_i^2) + \frac{2}{3}\,\frac{G_s(t_i) - G_s(t_{i+1})}{t_i - t_{i+1}}\,(t_{i+1}^3 - t_i^3),$$

and

$$E_L(T_s^2) = \sum_{i=0}^{F-1} \left[G_s(t_i) - \frac{G_s(t_i) - G_s(t_{i+1})}{t_i - t_{i+1}}\,t_i\right](t_{i+1}^2 - t_i^2) + \frac{2}{3} \sum_{i=0}^{F-1} \frac{G_s(t_i) - G_s(t_{i+1})}{t_i - t_{i+1}}\,(t_{i+1}^3 - t_i^3), \tag{2.6}$$

$$Var_L(T_s) = E_L(T_s^2) - [E(T_s)]^2 \tag{2.7}$$

where $E(T_s)$ is (1.2). Substituting $\hat{p}_{i,s}$ for $G_s(t_i)$ and $\hat{p}_{i+1,s}$ for $G_s(t_{i+1})$ we define an estimator of $Var(T_s)$ by

$$\widehat{Var}_L(T_s) = \hat{E}_L(T_s^2) - [\hat{E}(T_s)]^2. \tag{2.8}$$

We expand $\hat{E}_L(T_s^2)$ to obtain a simpler calculation form of $\widehat{Var}_L(T_s)$. Substituting $\hat{p}_{i,s}$ for $G_s(t_i)$ and $\hat{p}_{i+1,s}$ for $G_s(t_{i+1})$, factoring $(t_{i+1}^2 - t_i^2)$ and $(t_{i+1}^3 - t_i^3)$, and canceling appropriate terms in (2.6) we obtain

$$\hat{E}_L(T_s^2) = \sum_{i=0}^{F-1} \left[\hat{p}_{i,s}(t_{i+1}^2 - t_i^2) + (\hat{p}_{i,s} - \hat{p}_{i+1,s})(t_{i+1} + t_i)\,t_i - \frac{2}{3}(\hat{p}_{i,s} - \hat{p}_{i+1,s})(t_{i+1}^2 + t_{i+1}t_i + t_i^2)\right]$$

$$= \frac{1}{3} \sum_{i=0}^{F-1} \left[\hat{p}_{i,s}(t_{i+1}^2 + t_{i+1}t_i - 2t_i^2) + \hat{p}_{i+1,s}(2t_{i+1}^2 - t_{i+1}t_i - t_i^2)\right]$$

$$= \frac{1}{3} \sum_{i=0}^{F-1} \left[\hat{p}_{i,s}(t_{i+1} + 2t_i) + \hat{p}_{i+1,s}(2t_{i+1} + t_i)\right](t_{i+1} - t_i). \tag{2.9}$$

We recommend (2.9) in calculating $\widehat{Var}_L(T_s)$.
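The calculation form (2.9) translates directly into a loop. The sketch below is a hypothetical Python rendering (the function names are not from the report) applied to the pupal-stage proportions of the example in Section 3.

```python
def mean_time(times, p_hat):
    """Trapezoid estimate of the mean time to the stage."""
    return 0.5 * sum((p_hat[i] + p_hat[i + 1]) * (times[i + 1] - times[i])
                     for i in range(len(times) - 1))


def second_moment_straight_line(times, p_hat):
    """(1/3) * sum [p_i (t_{i+1} + 2 t_i) + p_{i+1} (2 t_{i+1} + t_i)] (t_{i+1} - t_i)."""
    total = 0.0
    for i in range(len(times) - 1):
        t_lo, t_hi = times[i], times[i + 1]
        total += (p_hat[i] * (t_hi + 2 * t_lo)
                  + p_hat[i + 1] * (2 * t_hi + t_lo)) * (t_hi - t_lo)
    return total / 3


def var_straight_line(times, p_hat):
    """Straight-line-method variance: second moment minus squared mean."""
    return second_moment_straight_line(times, p_hat) - mean_time(times, p_hat) ** 2


times = [0, 4, 7, 9, 11, 13, 15]
p_hat = [1, 1, 1, 3 / 18, 0, 0, 0]
m2_L = second_moment_straight_line(times, p_hat)
var_L = var_straight_line(times, p_hat)
```

Here the second moment is 211/3 and the variance is 8/9, whose square root (about 0.9428) agrees with the straight-line standard deviation for the pupal stage in Table 3.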


2.3  Two Relational Properties

Theorem 1. Define $Var_T(T_s)$ as (2.4) and $Var_L(T_s)$ as (2.7). Given $(t_i)_{i=0}^{F}$, $t_i \ge 0$, $t_{i+1} > t_i$ and $G_s(t_i) \ge G_s(t_{i+1})$, then $Var_T(T_s) > Var_L(T_s)$.

Proof: Note that proving $Var_T(T_s) > Var_L(T_s)$ is equivalent to proving $0 < E_T(T_s^2) - E_L(T_s^2)$. From (2.3) and substituting $G_s(t_i)$ for $\hat{p}_{i,s}$ and $G_s(t_{i+1})$ for $\hat{p}_{i+1,s}$ in (2.9), we obtain

$$E_T(T_s^2) - E_L(T_s^2) = \frac{1}{2} \sum_{i=0}^{F-1} [G_s(t_i) + G_s(t_{i+1})](t_{i+1} + t_i)(t_{i+1} - t_i) - \frac{1}{3} \sum_{i=0}^{F-1} [G_s(t_i)(t_{i+1} + 2t_i) + G_s(t_{i+1})(2t_{i+1} + t_i)](t_{i+1} - t_i)$$

$$= \frac{1}{6} \sum_{i=0}^{F-1} [3G_s(t_i)(t_{i+1} + t_i) + 3G_s(t_{i+1})(t_{i+1} + t_i) - 2G_s(t_i)(t_{i+1} + 2t_i) - 2G_s(t_{i+1})(2t_{i+1} + t_i)](t_{i+1} - t_i)$$

$$= \frac{1}{6} \sum_{i=0}^{F-1} [G_s(t_i)(t_{i+1} - t_i) + G_s(t_{i+1})(-t_{i+1} + t_i)](t_{i+1} - t_i)$$

$$= \frac{1}{6} \sum_{i=0}^{F-1} [G_s(t_i) - G_s(t_{i+1})](t_{i+1} - t_i)^2 \ge 0.$$

Thus $0 \le \frac{1}{6}\sum_{i=0}^{F-1} [G_s(t_i) - G_s(t_{i+1})](t_{i+1} - t_i)^2$. Strict inequality holds because $G_s(t)$ is nonincreasing monotonic, $G_s(t_0) = 1$ and $G_s(t_F) = 0$. Thus $Var_T(T_s) > Var_L(T_s)$, completing the proof. □


Theorem 1 holds for the sample estimates $\widehat{Var}_T(T_s)$ and $\widehat{Var}_L(T_s)$ when the $\hat{p}_{i,s}$ have the same monotonic property. Note that the $\hat{p}_{i,s}$ may not be monotonic because of sampling variation. However, we have always observed that $\widehat{Var}_T(T_s) > \widehat{Var}_L(T_s)$ in simulations and actual data applications.
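The identity at the end of the proof of Theorem 1 can be checked numerically. The sketch below is an illustration only: it assumes an exponential survival function and an arbitrary unequally spaced grid (both chosen for this example) and compares the difference of the two second-moment approximations with one sixth of the sum of $[G_s(t_i) - G_s(t_{i+1})](t_{i+1} - t_i)^2$.

```python
import math

# Hypothetical grid and survival values G(t) = exp(-t), used only for this check.
times = [0.0, 0.5, 1.0, 2.0, 4.0]
G = [math.exp(-t) for t in times]
steps = list(zip(times, times[1:], G, G[1:]))

# Trapezoid form of the second moment, as in (2.3).
e_trap = 0.5 * sum((g0 + g1) * (t1 ** 2 - t0 ** 2) for t0, t1, g0, g1 in steps)

# Straight-line form of the second moment, as in (2.9).
e_line = sum((g0 * (t1 + 2 * t0) + g1 * (2 * t1 + t0)) * (t1 - t0)
             for t0, t1, g0, g1 in steps) / 3

# Right-hand side of the identity derived in the proof.
gap = sum((g0 - g1) * (t1 - t0) ** 2 for t0, t1, g0, g1 in steps) / 6
```

The two sides agree to rounding error, and the gap is positive because the survival values decrease along the grid.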

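In the following Lemma we show that as the number of $t_i$ in $[t_0, t_F]$ increases, $\widehat{Var}_T(T_s)$ approaches $\widehat{Var}_L(T_s)$.

Lemma 1. Let $(t_i)_{i=0}^{F}$, $t_i \ge 0$, $t_{i+1} > t_i$ and $(n_i)_{i=0}^{F}$ be given. Define $\Delta t = t_{i+1} - t_i$. Then $\widehat{Var}_T(T_s) \to \widehat{Var}_L(T_s)$ as $\Delta t \to 0$.

Proof: Fix s. From Theorem 1, and substituting $\hat{p}_{i,s}$ for $G_s(t_i)$ and $\hat{p}_{i+1,s}$ for $G_s(t_{i+1})$, we have

$$\widehat{Var}_T(T_s) - \widehat{Var}_L(T_s) = \hat{E}_T(T_s^2) - \hat{E}_L(T_s^2) = \frac{1}{6} \sum_{i=0}^{F-1} (\hat{p}_{i,s} - \hat{p}_{i+1,s})(\Delta t)^2.$$

Recall that the right side may be less than zero if the $\hat{p}_{i,s}$ are not monotonic nonincreasing. Now,

$$\frac{1}{6} \sum_{i=0}^{F-1} (\hat{p}_{i,s} - \hat{p}_{i+1,s})(\Delta t)^2 \le \frac{1}{6}\max(\Delta t)^2 \sum_{i=0}^{F-1} (\hat{p}_{i,s} - \hat{p}_{i+1,s}) = \frac{1}{6}\max(\Delta t)^2$$

since $\hat{p}_{0,s} = 1$ and $\hat{p}_{F,s} = 0$. Thus $\lim_{\Delta t \downarrow 0} \frac{1}{6}\max(\Delta t)^2 = 0$ and $\widehat{Var}_T(T_s) \to \widehat{Var}_L(T_s)$ as $\Delta t \to 0$. □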
3.  AN EXAMPLE

We provide an example of the calculations for $\hat{E}(T_s)$, $\hat{E}(T_s - T_{s'})$, $\widehat{Var}(\hat{E}(T_s))$, $\widehat{Var}(\hat{E}(T_s - T_{s'}))$, $\widehat{Var}_T(T_s)$ and $\widehat{Var}_L(T_s)$. The data are from a stage frequency experiment on the insect Lysiphlebus testaceipes (Hymenoptera: Aphidiidae), an endoparasitoid of the aphid, Schizaphis graminum (Homoptera: Aphididae) (Table 2) (unpublished data, J. S. Pontius) (see Hight, Eikenbary, Miller and Starks (1972) for L. testaceipes biology). The $\hat{p}_{i,s}$ for the example calculations are included in Table 2. Since $t_1 = 4$, let $t_0 = 0$ and $\hat{p}_{0,s} = 1$.

Table 2.  Stage frequency data for L. testaceipes.

sample               stage               total   $\hat{p}_{i,1}$   $\hat{p}_{i,2}$
time          0          1       2       no.
(days)     egg-larva    pupa    adult

t_0 = 0       -          -       -        -       1         1
   4          8          0       0        8       1         1
   7         15          0       0       15       1         1
   9          3         15       0       18       .1667     1
  11          0         14       0       14       0         1
  13          0          2      12       14       0         .1429
  15          0          0       2        2       0         0

The estimated mean time to stage 1 (pupa) is $\hat{E}(T_1)$ = .5[4 + (1)(7 - 0) + (1)(9 - 4) + .1667(11 - 7) + 0 + 0] = 8.3 from (1.3). The estimated variance of the mean time to stage 1 is $\widehat{Var}(\hat{E}(T_1))$ = .25[0 + 0 + .0556(.1667(1 - .1667))(11 - 7)^2 + 0 + 0] = .031 from (1.5), so the estimated standard error is .176. The estimated mean duration time in stage 1 is $\hat{E}(T_2 - T_1)$ = .5[0 + 0 + .8333(11 - 7) + (1)(13 - 9) + .1429(15 - 11)] = 3.95 from (1.4). From (1.6), $\widehat{Var}(\hat{E}(T_2 - T_1))$ = .25[0 + 0 + .0556(.8333(1 - .8333))(11 - 7)^2 + 0 + .0714(.1429(1 - .1429))(15 - 11)^2] = .066. Estimates of second moments for the variances of time to stage 1 are $\hat{E}_T(T_1^2)$ = .5[(1 + 1)(4^2 - 0^2) + (1 + 1)(7^2 - 4^2) + (1 + .1667)(9^2 - 7^2) + (.1667 + 0)(11^2 - 9^2) + 0 + 0] = 71 and $\hat{E}_L(T_1^2)$ = (1/3)([(1)(4 + 0) + (1)(8 + 0)](4 - 0) + [(1)(7 + 8) + (1)(14 + 4)](7 - 4) + [(1)(9 + 14) + .1667(18 + 7)](9 - 7) + [.1667(11 + 18) + 0](11 - 9) + 0 + 0) = 70.3 from (2.3) and (2.9), respectively. Hence, from (2.5) and (2.8), the estimated variances of time to stage 1 are $\widehat{Var}_T(T_1)$ = 71 - (8.333)^2 = 1.56 and $\widehat{Var}_L(T_1)$ = 70.33 - (8.333)^2 = .89.
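The duration estimates in the example can be reproduced with a few lines of Python. This is a sketch of (1.4) and (1.6) with hypothetical function names, using the Table 2 proportions with the $t_0 = 0$ row prepended; it is not the SAS program of Section 4.

```python
import math


def mean_duration(times, p_s, p_sprime):
    """0.5 * sum_{i=1}^{F-1} (p_{i,s} - p_{i,s'}) * (t_{i+1} - t_{i-1})."""
    return 0.5 * sum((p_s[i] - p_sprime[i]) * (times[i + 1] - times[i - 1])
                     for i in range(1, len(times) - 1))


def var_mean_duration(times, p_s, p_sprime, n):
    """(1/4) * sum_{i=1}^{F-1} d_i (1 - d_i) / n_i * (t_{i+1} - t_{i-1})^2."""
    total = 0.0
    for i in range(1, len(times) - 1):
        d = p_s[i] - p_sprime[i]      # proportion currently in stage s'
        total += d * (1 - d) / n[i] * (times[i + 1] - times[i - 1]) ** 2
    return 0.25 * total


times = [0, 4, 7, 9, 11, 13, 15]
n = [1, 8, 15, 18, 14, 14, 2]          # n[0] is a placeholder; it is never used
p1 = [1, 1, 1, 3 / 18, 0, 0, 0]        # not yet pupa
p2 = [1, 1, 1, 1, 1, 2 / 14, 0]        # not yet adult
pupal_duration = mean_duration(times, p2, p1)
pupal_duration_se = math.sqrt(var_mean_duration(times, p2, p1, n))
```

The results (about 3.952 days with standard error about 0.2566) agree with the PUPA row of Table 3.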
4.  A COMPUTER PROGRAM FOR CALCULATING ESTIMATES

We developed a computer program (Appendix A) to calculate $\hat{E}(T_s)$, $\widehat{Var}(\hat{E}(T_s))$, $\hat{E}(T_s - T_{s'})$, $\widehat{Var}(\hat{E}(T_s - T_{s'}))$ ($s' = s - 1$), $\widehat{Var}_T(T_s)$ and $\widehat{Var}_L(T_s)$ from a set of stage frequency data (see Table 1). The program was coded in the Macro Language of the Statistical Analysis System (SAS) (Allen, 1982), version 82.3. SAS Macro Language requires a minimum of 500 K memory and the JCL execution card // EXEC SAS,OPTIONS=MACRO. The source code contains documentation for SAS data set structure, computational algorithms and user required initial values.

The SAS data set structure is documented in the program. The stage frequency table (see Table 1) is structured as a column formatted input data set, excluding the totals column. The program checks that the data set contains (1) nonnegative count data ($n_{i,s} \ge 0$, $0 \le i \le F$, $0 \le s \le A$) and (2) an increasing sequence of nonnegative sample times $(t_i)_{i=0}^{F}$. Error messages are printed on the SAS log if errors in (1) and/or (2) are detected. If the data do not contain $t_0 = 0$, an algorithm inserts $t_0 = 0$, $n_{0,0} = 1$ and $n_{0,s} = 0$, $0 < s \le A$, as the first row of the data set.
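The input checks described above could look roughly like the following Python sketch. It is not the SAS macro code from Appendix A; the function name `prepare_stage_data` and the use of `ValueError` in place of SAS log messages are assumptions made for illustration.

```python
def prepare_stage_data(times, counts):
    """Validate a stage frequency table and prepend a t_0 = 0 row if missing.

    times  : sample times, one per row
    counts : counts[i][s] is the number of organisms in stage s at times[i]
    """
    # Check (1): all counts are nonnegative.
    if any(c < 0 for row in counts for c in row):
        raise ValueError("counts must be nonnegative")
    # Check (2): sample times are nonnegative and strictly increasing.
    if any(t < 0 for t in times):
        raise ValueError("sample times must be nonnegative")
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("sample times must be increasing")
    # Insert the starting row: one organism in stage 0, none elsewhere.
    if times[0] != 0:
        n_stages = len(counts[0])
        times = [0] + list(times)
        counts = [[1] + [0] * (n_stages - 1)] + [list(row) for row in counts]
    return times, counts


clean_times, clean_counts = prepare_stage_data([4, 7], [[8, 0, 0], [15, 0, 0]])
```

The inserted row only fixes the starting proportion at 1; its sample size does not enter the estimators, which is why a single organism suffices.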

User-required initial values are in the main program section located at the end of the source code. The initial values are (1) number of stages in the data set (NS), (2) variables corresponding to each stage (COUNTk), (3) a list of character identifiers for the stages, and (4) the unit of measurement for sample times.


The estimates are calculated using matrices in PROC MATRIX. Three sections of output are generated by the program (Table 3). The first section contains the input data set. The variables COUNTk are as initialized in (2) above, k = 1, 2, ..., A+1 (corresponding to s = 0, 1, ..., A), and the values listed under DAY are sample times. The second section lists the stage identifiers and the estimates $\hat{E}(T_s)$, $\sqrt{\widehat{Var}(\hat{E}(T_s))}$, $\hat{E}(T_s - T_{s'})$ and $\sqrt{\widehat{Var}(\hat{E}(T_s - T_{s'}))}$ for $s' = s - 1$. The third section contains the stage identifiers, $\sqrt{\widehat{Var}_T(T_s)}$ and $\sqrt{\widehat{Var}_L(T_s)}$. Estimates for the pupal stage in Table 3 correspond to the example calculations in Section 3.


Table 3.  Example of computer program output.  Estimates correspond to the example in Section 3.


ESTIMATION  OF  TIME  TO  AND  DURATION  OF  STAGE 

FREQUENCY  DATA,  METHOD  OF  BOYER  AND  DEATON. 

PROGRAM  REVISED:  SEPTEMBER,  1986  BY  JS  PONTIUS. 

STUDY:  L.  TESTACEIPES 

UNIT  OF  TIME  MEASUREMENT:  DAYS 


OBS    DAY    COUNT1    COUNT2    COUNT3

  1      4         8         0         0
  2      7        15         0         0
  3      9         3        15         0
  4     11         0        14         0
  5     13         0         2        12
  6     15         0         0         2

ESTIMATION  OF  TIME  TO  AND  DURATION  OF  STAGE 

FREQUENCY  DATA,  METHOD  OF  BOYER  AND  DEATON. 

PROGRAM  REVISED:  SEPTEMBER,  1986  BY  JS  PONTIUS. 

STUDY:  L.  TESTACEIPES 

UNIT  OF  TIME  MEASUREMENT:  DAYS 

STAGE       TIME TO        STD ERROR OF    DURATION TIME       STD ERROR OF
            REACH STAGE    E(T(S))         E(T(S) - T(S'))     E(T(S) - T(S'))
            ( E(T(S)) )

EGGLARVA       .                            8.33333            0.175682
PUPA         8.3333        0.175682         3.95238            0.256612
ADULT       12.2857        0.187044


ESTIMATION  OF  TIME  TO  AND  DURATION  OF  STAGE 

FREQUENCY  DATA,  METHOD  OF  BOYER  AND  DEATON. 

PROGRAM  REVISED:  SEPTEMBER,  1986  BY  JS  PONTIUS. 

STUDY:  L.  TESTACEIPES 

UNIT  OF  TIME  MEASUREMENT:  DAYS 


STAGE       STD DEVIATION OF T(S)     STD DEVIATION OF T(S)
            -TRAPEZOID ANALOG-        -STRAIGHT LINE-

PUPA        1.24722                   0.942809
ADULT       1.22057                   0.907265
5.  SIMULATION UNDER FIVE SURVIVAL DISTRIBUTIONS

We compare the performance of $\hat{E}(T_s)$, $\widehat{Var}_T(T_s)$, $\widehat{Var}_L(T_s)$ and $\widehat{Var}(\hat{E}(T_s))$ to their respective expected values $E(T_s)$ and $Var(T_s)$ (for the first 3 estimators) of 5 survival ($G_s(t)$) distributions under specified sampling conditions for a cohort of organisms with 2 stages, $s \in \{0, 1\}$. We selected the uniform, exponential, beta, normal and standard gamma survival distributions to provide a variety of distributional shapes (Table 4).

Table 4.  Survival $G_s(t)$, parameter value(s), $E(T_s)$ and $Var(T_s)$ for each survival distribution used in simulations.

survival distribution                                                                          parameter value(s)       E(T_s)   Var(T_s)

uniform:      $G_s(t) = (1 - t)\,I_{[0,1]}(t)$                                                                           0.5      1/12
exponential:  $G_s(t) = e^{-\lambda t}\,I_{[0,\infty)}(t)$                                      λ = 1                    1.0      1.0
beta:         $G_s(t) = 1 - \int_0^t \frac{1}{B(\alpha,\beta)} x^{\alpha-1}(1-x)^{\beta-1} I_{[0,1]}(x)\,dx$    α = 2.0, β = 0.5         0.8      0.046
normal:       $G_s(t) = 1 - \int_{-\infty}^t \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] I_{(-\infty,\infty)}(x)\,dx$    μ = 3.5, σ = 1.0    3.5    1.0
gamma:        $G_s(t) = 1 - \int_0^t \frac{1}{\Gamma(\alpha)} x^{\alpha-1} e^{-x} I_{[0,\infty)}(x)\,dx$    α = 2.0                  2.0      2.0

We selected the uniform because we expected $\hat{E}(T_s)$ to closely estimate $E(T_s)$ since the trapezoidal approximation should do well on the uniform's straight line survival distribution (Fig. 2 and see Fig. 1). We expected $\hat{E}(T_s)$ to overestimate $E(T_s)$ from an exponential $G_s(t)$ because the trapezoidal approximation should overestimate the area under the convex exponential survival distribution (Fig. 3). Conversely, we expected $\hat{E}(T_s)$ to underestimate $E(T_s)$ from a concave beta survival distribution (Fig. 4). Because a normal distribution is symmetric about $E(T_s)$, we expected underestimation of the trapezoids at the upper tail and overestimation of trapezoids at the lower tail (Fig. 5). Hence, possibly some cancelling effect would occur. We selected a gamma because it is a widely used survival distribution and exhibits shape asymmetry (Fig. 6).
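The expected direction of the trapezoid error can be seen on the exact survival curves, before any sampling variation is added. The Python sketch below is illustrative only: the grids are arbitrary, and the closed form used for the beta(2, 0.5) survival function, $G(t) = \frac{3}{2}(1-t)^{1/2} - \frac{1}{2}(1-t)^{3/2}$, was worked out for this example by integrating the density in Table 4 rather than taken from the report.

```python
import math


def trapezoid(g, grid):
    """Trapezoid approximation to the area under g over the grid."""
    return 0.5 * sum((g(a) + g(b)) * (b - a) for a, b in zip(grid, grid[1:]))


def uniform_g(t):
    return 1.0 - t                      # straight line on [0, 1]


def exponential_g(t):
    return math.exp(-t)                 # convex


def beta_g(t):
    u = 1.0 - t                         # concave on [0, 1]; derived closed form
    return 1.5 * math.sqrt(u) - 0.5 * u ** 1.5


unit_grid = [0.0, 0.25, 0.5, 0.75, 1.0]
long_grid = [float(k) for k in range(11)]

est_uniform = trapezoid(uniform_g, unit_grid)     # exact area is 0.5
est_beta = trapezoid(beta_g, unit_grid)           # exact area is 0.8
est_exp = trapezoid(exponential_g, long_grid)     # exact area on [0, 10] is 1 - e^-10
```

On these grids the uniform area is recovered exactly, the beta area is underestimated and the exponential area is overestimated, which is the pattern anticipated in the paragraph above.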

The variables we used in sampling from each survival distribution were (1) the time interval ($\Delta t$) between sample times $(t_i)_{i=0}^{F}$ and (2) the number of organisms ($n_i$) in stages 0 and 1 'observed' at each $t_i$. To construct a similar sampling regimen over distributions, $\Delta t$ was scaled with respect to $Var(T_s)$ of each distribution as

$$\Delta t = c\sqrt{Var(T_s)}, \tag{5.1}$$

where c = 1, 1/2, 1/4 or 1/8. Thus $\Delta t$ was constant for each simulation (i.e., sample times were equidistant). Note that $\Delta t$ is just a proportion of the standard deviation of each $G_s(t)$. Also, the number of samples (i.e., the 'number of organisms observed') per $t_i$ were $n_i$ = 5, 10, 20 or 40 for all $t_i$, i = 0, ..., F. For brevity, let $n = n_i$, i = 0, ..., F. Thus the performance of the estimators was evaluated based on 16 combinations of c and n, for each survival distribution. Fifty 'cohorts' were sampled for each combination of c and n.

The statistics we used to evaluate the performance of the estimators were mean and root mean square of $\hat{E}(T_s)$, $\widehat{Var}_T(T_s)$ and $\widehat{Var}_L(T_s)$ each and the mean of $\widehat{Var}(\hat{E}(T_s))$. Clearly, the sample size for the mean and root mean square of each estimator is 50.

5.1  Simulation Algorithm

We describe the algorithm used in our simulations. We used PROC MATRIX of SAS, version 5, (processed on a NAS 6630) because our computer algorithms to calculate estimates were already coded in SAS (see Section 4). Version 5 was used because the RANBIN binomial random number generator in version 82.3 was defective. We were convinced that RANBIN in version 5 was correct after several tests on RANBIN were performed.

The simulation algorithm is as follows. Algorithm instructions apply to each survival distribution. First initialize $E(T_s)$, $Var(T_s)$, c and n. Next generate the sample times $(t_i)_{i=0}^{F}$. Set $t_0 = 0$. Generate sample times $(t_i)_{i=1}^{F-1}$ by $t_{i+1} = t_i + \Delta t$, i = 0, ..., F-2, where $\Delta t$ is determined from (5.1). To ensure that $p_{F,s} = 0$, calculate $t_F$ for uniform and beta survival distributions as $t_F = t_{F-1} + \Delta t$ where $t_{F-1} = \max\{t_i \in [0, 1]\}$. Then calculate $t_F$ for exponential, normal and gamma survival distributions as $t_F = t_{F-1} + \Delta t$ where $t_{F-1} = \max\{t_i \mid P(T_s \le t_i) \le 0.9999\}$.
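The construction of the sample times for a distribution supported on [0, 1] can be sketched as follows. This is a hypothetical Python rendering of the step just described (the function name is an assumption), shown for the uniform survival distribution with $Var(T_s) = 1/12$ and c = 1.

```python
import math


def sample_times_unit_support(variance, c):
    """Equally spaced sample times for a survival distribution on [0, 1].

    The spacing is c * sqrt(variance), as in (5.1). Times are generated
    while they stay inside [0, 1]; one further time is appended so that
    the final sample time lies beyond the support.
    """
    dt = c * math.sqrt(variance)
    times = [0.0]
    while times[-1] + dt <= 1.0:
        times.append(times[-1] + dt)
    times.append(times[-1] + dt)        # final time t_F, outside [0, 1]
    return times


grid = sample_times_unit_support(1 / 12, 1.0)
```

For these settings the spacing is about 0.2887, giving three interior sample times and a final time near 1.155, where the survival probability is zero.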

For each $t_i$, determine a corresponding $p_{i,0}$. Set $p_{0,0} = 1$. The $p_{i,0}$, i = 1, ..., F-1, for uniform and exponential are determined from the survival distributions (see Table 4). The $p_{i,0}$, i = 1, ..., F-1, for beta, normal and gamma are calculated as $p_{i,0} = 1 - P(T_s \le t_i)$ where $P(T_s \le t_i)$ are determined from the SAS probability generators PROBBETA, PROBNORM and PROBGAM, respectively. Parameter values for the probability generators are listed in Table 4. Set all $p_{F,0} = 0$.

Now determine the number of organisms 'observed' in stage 0. For each of the 50 cohort simulations do the following. Initialize all $n_{i,s} = 0$ for each survival distribution. Set $n_{0,0} = n$ (i.e., all organisms are in stage 0 at t = 0). For each $t_i$, i = 1, ..., F, randomly select the number of organisms in stage 0 using the binomial random number generator RANBIN with parameters $p_{i,0}$ and n. Stop sampling when the first $n_{i,0} = 0$, i > 0. Calculate $n_{i,1} = n - n_{i,0}$, i = 0, ..., F, to determine the number of organisms that have reached stage 1 by $t_i$. Calculate $\hat{E}(T_s)$, $\widehat{Var}(\hat{E}(T_s))$, $\widehat{Var}_T(T_s)$ and $\widehat{Var}_L(T_s)$ and store the estimates.

After  50  cohorts  have  been  simulated,  calculate  the  evaluation 
statistics  specified  above.  Print  the  evaluation  statistics.  This 
ends  the  algorithm. 
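One cohort of the simulation can be sketched in Python as below. This is only an illustration of the steps described above: the binomial draw is built from Bernoulli trials with Python's `random` module as a stand-in for RANBIN, the exponential survival function and the grid are chosen arbitrarily, and the seed and function name are hypothetical.

```python
import math
import random


def simulate_stage0_counts(times, survival, n, rng):
    """Draw the number of organisms still in stage 0 at each sample time."""
    counts = [n]                         # all n organisms are in stage 0 at t_0
    for t in times[1:]:
        p = survival(t)                  # probability of still being in stage 0
        draw = sum(rng.random() < p for _ in range(n))   # binomial(n, p) draw
        counts.append(draw)
        if draw == 0:
            break                        # stop at the first empty stage 0 sample
    return counts


rng = random.Random(1986)                # arbitrary seed for reproducibility
n = 10
times = [0.25 * k for k in range(41)]    # spacing 1/4 of the exponential std dev
stage0 = simulate_stage0_counts(times, lambda t: math.exp(-t), n, rng)
stage1 = [n - c for c in stage0]         # organisms that have reached stage 1

# Trapezoid estimate of the mean time to stage 1 for this cohort.
p_hat = [c / n for c in stage0]
used = times[:len(stage0)]
mean_est = 0.5 * sum((p_hat[i] + p_hat[i + 1]) * (used[i + 1] - used[i])
                     for i in range(len(used) - 1))
```

Repeating the draw for 50 cohorts and summarizing the stored estimates would give the evaluation statistics described in the text.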
5.2  Results and Discussion of Simulations

We present the simulations' results and corresponding discussion in the order $\hat{E}(T_s)$, $\widehat{Var}(\hat{E}(T_s))$, $\widehat{Var}_T(T_s)$ and $\widehat{Var}_L(T_s)$. For each estimator we present its overall performance and then present results pertinent to selected survival distributions. Each sampling regimen will be referenced by c, each sample size per $t_i$ will be referenced by n, and specific combinations of c and n will be referenced by (c ; n). Section 5.4 contains an overall discussion of the simulations.

5.2.1  $\hat{E}(T_s)$

Our overall evaluation of the performance of $\hat{E}(T_s)$ is based on trends in the means and estimated root mean squares (RMS) of $\hat{E}(T_s)$ (Table 5) and 95% confidence intervals for $E(T_s)$ (Table 6). We constructed 95% confidence intervals for $E(T_s)$ by mean($\hat{E}(T_s)$) ± 1.96√[mean($\widehat{Var}(\hat{E}(T_s))$)/50] where values of mean($\widehat{Var}(\hat{E}(T_s))$) are listed in Table 7 and 50 is the number of simulations for each survival distribution, c and n combination. If $E(T_s)$ is contained in the confidence interval then we consider $\hat{E}(T_s)$ to be a good estimator of $E(T_s)$ under the particular survival distribution, c and n combination. Note that the confidence intervals can also be used to test $H_0$: $\hat{E}(T_s)$ is an unbiased estimate of $E(T_s)$ vs. $H_a$: $\hat{E}(T_s)$ is a biased estimate of $E(T_s)$. If $H_0$ is rejected, the bias can be estimated by bias = mean($\hat{E}(T_s)$) - $E(T_s)$ from the values in Table 5.
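The interval construction can be written out as a small Python sketch. The function name is an assumption and the fifty estimates below are made-up numbers used only to exercise the formula; they are not values from Tables 5 to 7.

```python
import math


def confidence_interval(estimates, variance_estimates):
    """mean(estimates) +/- 1.96 * sqrt(mean(variance_estimates) / k)."""
    k = len(estimates)                   # number of simulated cohorts
    centre = sum(estimates) / k
    half_width = 1.96 * math.sqrt((sum(variance_estimates) / k) / k)
    return centre - half_width, centre + half_width


# Hypothetical results from 50 cohorts: mean estimates and their variances.
estimates = [0.9, 1.1] * 25
variance_estimates = [0.02] * 50
lower, upper = confidence_interval(estimates, variance_estimates)
true_mean = 1.0
bias = sum(estimates) / len(estimates) - true_mean
covered = lower <= true_mean <= upper
```

In this made-up case the interval is roughly (0.961, 1.039) and contains the true mean, so the estimator would be judged adequate for that combination of c and n.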

Overall, $E(T_s)$ is contained in all confidence intervals for the beta survival distribution (Table 6), $E(T_s)$ is contained in most confidence intervals for the uniform and normal survival distributions, and $E(T_s)$ is contained in less than half of the confidence intervals for the exponential and gamma survival distributions. Thus we conclude that $\hat{E}(T_s)$ is a good estimator of the expected values $E(T_s)$ for the beta and uniform survival distributions, and for (1 ; 5, 10, 20, 40), (1/2 ; 20, 40), (1/4 ; 10, 20, 40) and (1/8 ; 20, 40) for the normal survival distribution. $\hat{E}(T_s)$ appears to be a biased estimator of $E(T_s)$ of the exponential and gamma survival distributions when c = 1/4 or c = 1/8. Even though the bias of $\hat{E}(T_s)$ should decrease as $\Delta t$ decreases (see Section 1), this assumes that all $t_i$ have been sampled. Possibly the presence of bias when c = 1/4 and c = 1/8 for the exponential and gamma survival distributions is because of some $t_i$ (in the right part of the survival distribution) consistently not being sampled in simulations.

For the exponential, normal and gamma survival distributions, the means of $\hat{E}(T_s)$ tended to decrease as c decreased and n increased. However, the means of $\hat{E}(T_s)$ tended to be similar across all (c ; n) for the uniform and beta survival distributions. Means of $\hat{E}(T_s)$ were closer overall to $E(T_s)$ of beta and uniform survival distributions. Conversely, means of $\hat{E}(T_s)$ were farther overall from $E(T_s)$ of the gamma survival distribution. Possibly $\hat{E}(T_s)$ better estimates $E(T_s)$ of survival distributions with shapes similar to the concave beta and uniform survival distributions. But the [0, 1] domains of the uniform and beta survival distributions compared with the $[0, \infty[$ domains of the exponential, normal and gamma survival distributions may have influenced these results. Estimated root mean squares of $\hat{E}(T_s)$ tended to decrease as c decreased and n increased (Table 5).

For the exponential survival distribution, $\hat{E}(T_s)$ overestimated $E(T_s)$ for (1 ; 5, 10, 20, 40) as we expected but $\hat{E}(T_s)$ tended to underestimate $E(T_s)$ for (1/2, 1/4, 1/8 ; 5, 10, 20). For the [...] survival distribution, $\hat{E}(T_s)$ tended to underestimate $E(T_s)$, especially for smaller n and c.

5.2.2 $\widehat{\mathrm{Var}}(\hat{E}(T_s))$

Boyer and Deaton (1984) concluded that $\mathrm{Var}(\hat{E}(T_s))$ would be decreased by taking more samples (i.e., increasing $n_i$) at each $t_i$. Means of $\widehat{\mathrm{Var}}(\hat{E}(T_s))$ from the simulations (Table 7) support their conclusion. For each survival distribution and $c$, means of $\widehat{\mathrm{Var}}(\hat{E}(T_s))$ decreased as $n$ increased; and for each survival distribution and $n$, means of $\widehat{\mathrm{Var}}(\hat{E}(T_s))$ decreased as $c$ decreased. Overall, we conclude that increasing $n_i$ at each $t_i$ and increasing the number of $t_i$ in $[t_0, t_F]$ results in a smaller estimate of $\mathrm{Var}(\hat{E}(T_s))$.
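
How the sample size enters the variance estimate can be seen in a small calculation. The Python sketch below mirrors the arithmetic of the `_INSECT_` macro in Appendix A: a trapezoid sum for the expected time, and a sum of binomial variances weighted by squared time differences for its variance. The function name and the example counts are hypothetical and are not data from the report.

```python
def expected_time_and_variance(times, not_yet_counts, sample_sizes):
    """Trapezoid estimate of E(T_s) and the binomial-based estimate of its variance.

    not_yet_counts[i] is the number of sampled organisms at times[i] that have
    not yet reached stage s; sample_sizes[i] is n_i.
    """
    props = [k / n for k, n in zip(not_yet_counts, sample_sizes)]
    last = len(times) - 1
    mean = sum(
        0.5 * (props[i] + props[i + 1]) * (times[i + 1] - times[i])
        for i in range(last)
    )
    # Interior sample times only, each weighted by (t[i+1] - t[i-1])^2 / 4.
    variance = sum(
        0.25 * props[i] * (1.0 - props[i]) / sample_sizes[i]
        * (times[i + 1] - times[i - 1]) ** 2
        for i in range(1, last)
    )
    return mean, variance


# Hypothetical example: same proportions, n = 10 versus n = 20 at each time.
times = [0, 1, 2, 3, 4]
mean10, var10 = expected_time_and_variance(times, [10, 8, 5, 2, 0], [10] * 5)
mean20, var20 = expected_time_and_variance(times, [20, 16, 10, 4, 0], [20] * 5)
print(mean10, var10, var20)
```

In this example, doubling $n_i$ while holding the proportions fixed leaves the estimate of $E(T_s)$ unchanged and halves the variance estimate, in line with the trend in Table 7.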


5.2.3 $\widehat{\mathrm{Var}}_T(T_s)$

$\widehat{\mathrm{Var}}_T(T_s)$ tended to overestimate $\mathrm{Var}(T_s)$ of each survival distribution when $c = 1$, and for smaller $n$ (usually $n = 5$ or $n = 10$) for all other values of $c$ (Table 8). Overall, $\widehat{\mathrm{Var}}_T(T_s)$ better estimated $\mathrm{Var}(T_s)$ for smaller $c$ and larger $n$ under the uniform, beta and normal survival distributions; and for smaller $c$ under the exponential and gamma survival distributions. Means of $\widehat{\mathrm{Var}}_T(T_s)$ were closer overall to $\mathrm{Var}(T_s)$ of the beta survival distribution and farther from $\mathrm{Var}(T_s)$ of the exponential and gamma survival distributions. So $\widehat{\mathrm{Var}}_T(T_s)$ better estimated $\mathrm{Var}(T_s)$ of beta and uniform survival distributions. Trends in RMS of $\widehat{\mathrm{Var}}_T(T_s)$ were variable and appear to depend on the particular survival distribution.

For the uniform survival distribution, $\widehat{\mathrm{Var}}_T(T_s)$ overestimated $\mathrm{Var}(T_s)$ when $c = 1$, (1/4 ; 5) and (1/8 ; 5). For all $n$ when $c = 1/2$ and for (1/4, 1/8 ; 10, 20, 40), $\widehat{\mathrm{Var}}_T(T_s)$ reasonably estimated $\mathrm{Var}(T_s)$. The RMS's of $\widehat{\mathrm{Var}}_T(T_s)$ tended to decrease as $c$ decreased and $n$ increased.

For the exponential survival distribution, $\widehat{\mathrm{Var}}_T(T_s)$ varied considerably with changes in $n$ within and across values of $c$. Means of $\widehat{\mathrm{Var}}_T(T_s)$ were closer to $\mathrm{Var}(T_s)$ for (1 ; 20, 40). For all other $(c\,;\,n)$, $\widehat{\mathrm{Var}}_T(T_s)$ underestimated $\mathrm{Var}(T_s)$. In (1/2 ; 5) and (1/4, 1/8 ; 5, 10), $\widehat{\mathrm{Var}}_T(T_s)$ underestimated $\mathrm{Var}(T_s)$ considerably. The RMS of $\widehat{\mathrm{Var}}_T(T_s)$ decreased as $n$ increased for each $c$.

For the beta survival distribution, $\widehat{\mathrm{Var}}_T(T_s)$ reasonably estimated $\mathrm{Var}(T_s)$, particularly as $c$ decreased and $n$ increased. The RMS of $\widehat{\mathrm{Var}}_T(T_s)$ decreased as $c$ decreased and $n$ increased.

For the normal survival distribution, $\widehat{\mathrm{Var}}_T(T_s)$ overestimated $\mathrm{Var}(T_s)$ when $c = 1$. For all other $(c\,;\,n)$, except for (1/4, 1/8 ; 5), $\widehat{\mathrm{Var}}_T(T_s)$ reasonably estimated $\mathrm{Var}(T_s)$. The RMS of $\widehat{\mathrm{Var}}_T(T_s)$ decreased as $n$ increased for each $c$.

For the gamma survival distribution, $\widehat{\mathrm{Var}}_T(T_s)$ varied considerably with changes in $n$ within and across values of $c$. Means of $\widehat{\mathrm{Var}}_T(T_s)$ were closer to $\mathrm{Var}(T_s)$ for (1 ; 5, 10) and (1/2 ; 40). In (1/4 ; 5) and (1/8 ; 5, 10), $\widehat{\mathrm{Var}}_T(T_s)$ underestimated $\mathrm{Var}(T_s)$ considerably. The RMS of $\widehat{\mathrm{Var}}_T(T_s)$ decreased as $n$ increased for each $c$.


5.2.4 $\widehat{\mathrm{Var}}_L(T_s)$

Means of $\widehat{\mathrm{Var}}_L(T_s)$ (Table 9) were less than the means of $\widehat{\mathrm{Var}}_T(T_s)$ for all survival distributions and $(c\,;\,n)$ (equality is the result of rounding of estimates), as proved in Theorem 1 (see Section 2.3). In general, statements about $\widehat{\mathrm{Var}}_T(T_s)$ in Section 5.2.3 pertain to $\widehat{\mathrm{Var}}_L(T_s)$. As $c$ decreased and $n$ increased, $\widehat{\mathrm{Var}}_L(T_s)$ approached $\widehat{\mathrm{Var}}_T(T_s)$, as proved in Lemma 1 (see Section 2.3). The RMS's of both estimators were similar for smaller $c$ (especially $c = 1/8$). Hence, $\widehat{\mathrm{Var}}_T(T_s)$ and $\widehat{\mathrm{Var}}_L(T_s)$ give similar estimates of $\mathrm{Var}(T_s)$ as the number of $t_i$ in $[t_0, t_F]$ and $n_i$ are increased.
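
The relation between the two variance estimators can be checked numerically. The Python sketch below follows the second-moment formulas coded in the `_INSECT_` macro of Appendix A, where the two results are labeled "trapezoid analog" and "straight line". The function name and the example proportions are hypothetical. For equally spaced sample times whose proportions start at 1 and end at 0, the two second moments differ by exactly $\Delta t^2 / 6$, so the gap vanishes as $\Delta t$ shrinks.

```python
def variance_estimates(times, props):
    """Return (Var_T, Var_L) from proportions not yet in stage s at each sample time."""
    pairs = range(len(times) - 1)
    mean = sum(0.5 * (props[i] + props[i + 1]) * (times[i + 1] - times[i]) for i in pairs)
    # Trapezoid analog of E(T^2): average proportion times the increment in t^2.
    second_t = sum(
        0.5 * (props[i] + props[i + 1]) * (times[i + 1] ** 2 - times[i] ** 2)
        for i in pairs
    )
    # Straight-line version: integrate 2t * S(t) with S linear between sample times.
    second_l = sum(
        (props[i] * (times[i + 1] + 2 * times[i])
         + props[i + 1] * (2 * times[i + 1] + times[i]))
        * (times[i + 1] - times[i]) / 3.0
        for i in pairs
    )
    return second_t - mean ** 2, second_l - mean ** 2


# Hypothetical proportions at unit spacing (delta t = 1).
var_t, var_l = variance_estimates([0, 1, 2, 3, 4], [1.0, 0.8, 0.5, 0.2, 0.0])
print(var_t, var_l, var_t - var_l)
```

Here the difference is 1/6, matching $\Delta t^2 / 6$ with $\Delta t = 1$.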
Figure 2. Graph of uniform $G_s(t)$ used in simulations.
Figure 3. Graph of exponential $G_s(t)$ used in simulations.
Figure 4. Graph of beta $G_s(t)$ used in simulations.
Figure 5. Graph of normal $G_s(t)$ used in simulations.
Figure 6. Graph of gamma $G_s(t)$ used in simulations.
Table 5. $\hat{E}(T_s)$ results from survival distribution simulations. Mean of $\hat{E}(T_s)$ and (root mean square) are listed for each c and n combination under each survival distribution.

DISTRIBUTION:    uniform      exponential    beta         normal        gamma
E(T_s):          .5           1.0            .8           3.5           2.0

  c     n
  1     5        .50 (.10)    1.06 (.35)     .80 (.07)    3.45 (.34)    1.88 (.46)
       10        .53 (.09)    1.02 (.21)     .79 (.05)    3.53 (.28)    1.96 (.29)
       20        .51 (.05)    1.06 (.16)     .80 (.04)    3.53 (.17)    2.05 (.24)
       40        .51 (.04)    1.08 (.14)     .80 (.02)    3.53 (.11)    1.98 (.18)

 1/2    5        .48 (.08)     .83 (.31)     .80 (.05)    3.41 (.28)    1.81 (.39)
       10        .50 (.06)     .95 (.18)     .81 (.04)    3.43 (.21)    1.90 (.25)
       20        .50 (.03)     .99 (.09)     .80 (.03)    3.51 (.12)    1.98 (.19)
       40        .50 (.02)    1.00 (.06)     .80 (.02)    3.50 (.10)    1.99 (.11)

 1/4    5        .50 (.06)     .81 (.28)     .79 (.03)    3.45 (.22)    1.78 (.33)
       10        .50 (.04)     .90 (.16)     .80 (.02)    3.47 (.10)    1.90 (.19)
       20        .50 (.02)     .92 (.12)     .80 (.02)    3.50 (.07)    1.94 (.13)
       40        .50 (.02)     .99 (.07)     .80 (.01)    3.49 (.07)    1.97 (.10)

 1/8    5        .44 (.09)     .76 (.29)     .79 (.03)    3.34 (.22)    1.65 (.42)
       10        .49 (.04)     .84 (.20)     .80 (.02)    3.43 (.12)    1.78 (.25)
       20        .50 (.02)     .93 (.10)     .80 (.01)    3.49 (.05)    1.94 (.12)
       40        .50 (.01)     .96 (.05)     .80 (.01)    3.49 (.04)    1.96 (.09)
Table 6. 95% confidence intervals for $E(T_s)$ based on simulations. Confidence intervals are listed for each c and n combination under each survival distribution. * labels confidence intervals that do not contain $E(T_s)$.

DISTRIBUTION:    uniform        exponential     beta          normal          gamma
E(T_s):          .5             1.0             .8            3.5             2.0

  c     n
  1     5        (.48, .53)     ( .99,1.13)     (.78, .82)    (3.36,3.54)     (1.77,1.99)*
       10        (.51, .54)*    ( .97,1.07)     (.78, .80)    (3.47,3.59)     (1.88,2.04)
       20        (.50, .52)     (1.02,1.10)*    (.79, .81)    (3.48,3.58)     (1.99,2.11)
       40        (.50, .52)     (1.05,1.11)*    (.79, .81)    (3.50,3.56)     (1.94,2.02)

 1/2    5        (.46, .50)     ( .79, .87)*    (.79, .81)    (3.35,3.47)*    (1.74,1.88)*
       10        (.48, .51)     ( .91, .99)*    (.80, .82)    (3.39,3.47)*    (1.84,1.96)*
       20        (.49, .51)     ( .96,1.02)     (.79, .81)    (3.48,3.54)     (1.94,2.02)
       40        (.49, .51)     ( .98,1.02)     (.80, .81)    (3.48,3.52)     (1.96,2.02)

 1/4    5        (.49, .51)     ( .78, .84)*    (.79, .81)    (3.41,3.49)*    (1.73,1.83)*
       10        (.49, .51)     ( .87, .93)*    (.80, .81)    (3.44,3.50)     (1.86,1.94)*
       20        (.49, .51)     ( .90, .94)*    (.80, .81)    (3.48,3.52)     (1.91,1.97)*
       40        (.50, .51)     ( .98,1.01)     (.80, .80)    (3.47,3.51)     (1.95,1.99)*

 1/8    5        (.43, .45)*    ( .73, .78)*    (.78, .80)    (3.31,3.37)*    (1.62,1.68)*
       10        (.48, .50)     ( .82, .86)*    (.80, .81)    (3.41,3.45)*    (1.75,1.81)*
       20        (.50, .51)     ( .92, .94)*    (.80, .80)    (3.47,3.51)     (1.92,1.96)*
       40        (.50, .50)     ( .95, .97)*    (.80, .80)    (3.48,3.50)     (1.95,1.98)*
Table 7. $\widehat{\mathrm{Var}}(\hat{E}(T_s))$ results from survival distribution simulations. Mean of $\widehat{\mathrm{Var}}(\hat{E}(T_s))$ is listed for each c and n under each survival distribution.

DISTRIBUTION:    uniform    exponential    beta      normal    gamma

  c     n
  1     5        .0080      .0576          .0037     .1024     .1517
       10        .0044      .0327          .0023     .0531     .0879
       20        .0023      .0193          .0011     .0274     .0489
       40        .0011      .0103          .0006     .0138     .0254

 1/2    5        .0038      .0250          .0020     .0474     .0710
       10        .0022      .0190          .0011     .0239     .0442
       20        .0011      .0108          .0006     .0135     .0244
       40        .0006      .0055          .0003     .0069     .0130

 1/4    5        .0017      .0134          .0010     .0195     .0311
       10        .0011      .0087          .0005     .0120     .0210
       20        .0006      .0051          .0003     .0066     .0117
       40        .0003      .0029          .0001     .0034     .0062

 1/8    5        .0008      .0059          .0005     .0087     .0127
       10        .0005      .0040          .0003     .0057     .0093
       20        .0003      .0025          .0001     .0032     .0057
       40        .0001      .0014          .0001     .0017     .0030
Table 8. $\widehat{\mathrm{Var}}_T(T_s)$ results from survival distribution simulations. Mean of $\widehat{\mathrm{Var}}_T(T_s)$ and (root mean square) are listed for each c and n combination under each survival distribution.

DISTRIBUTION:    uniform        exponential    beta           normal        gamma
Var(T_s):        .083           1.0            .046           1.0           2.0

  c     n
  1     5        .108 (.056)     .74 (.48)     .053 (.027)    1.39 (.83)    1.90 (1.02)
       10        .108 (.040)     .84 (.47)     .062 (.024)    1.35 (.63)    2.17 ( .80)
       20        .111 (.034)    1.01 (.35)     .060 (.022)    1.34 (.46)    2.43 ( .81)
       40        .105 (.025)    1.12 (.32)     .062 (.018)    1.33 (.40)    2.54 ( .77)

 1/2    5        .078 (.029)     .41 (.66)     .047 (.023)     .99 (.48)    1.31 (1.02)
       10        .089 (.023)     .68 (.44)     .049 (.015)     .92 (.23)    1.69 ( .74)
       20        .086 (.015)     .83 (.28)     .051 (.012)    1.06 (.22)    1.85 ( .53)
       40        .088 (.010)     .88 (.21)     .053 (.010)    1.07 (.15)    2.09 ( .38)

 1/4    5        .069 (.032)     .35 (.68)     .048 (.017)     .72 (.41)     .91 (1.18)
       10        .080 (.014)     .50 (.54)     .045 (.010)     .90 (.28)    1.34 ( .77)
       20        .083 (.013)     .64 (.40)     .047 (.005)     .97 (.16)    1.64 ( .54)
       40        .086 (.008)     .85 (.23)     .047 (.005)     .98 (.11)    1.72 ( .38)

 1/8    5        .057 (.036)     .27 (.75)     .043 (.012)     .62 (.46)     .64 (1.39)
       10        .076 (.019)     .42 (.61)     .047 (.008)     .80 (.26)    1.08 (1.01)
       20        .085 (.009)     .62 (.40)     .046 (.004)     .91 (.15)    1.43 ( .63)
       40        .083 (.006)     .75 (.28)     .047 (.004)     .96 (.09)    1.69 ( .40)
Table 9. $\widehat{\mathrm{Var}}_L(T_s)$ results from survival distribution simulations. Mean of $\widehat{\mathrm{Var}}_L(T_s)$ and (root mean square) are listed for each c and n combination under each survival distribution.

DISTRIBUTION:    uniform        exponential    beta           normal        gamma
Var(T_s):        .083           1.0            .046           1.0           2.0

  c     n
  1     5        .095 (.051)     .58 (.59)     .045 (.026)    1.22 (.76)    1.57 (1.11)
       10        .094 (.034)     .67 (.55)     .054 (.020)    1.18 (.56)    1.83 ( .80)
       20        .097 (.024)     .84 (.39)     .053 (.018)    1.17 (.35)    2.10 ( .69)
       40        .091 (.015)     .96 (.30)     .054 (.012)    1.16 (.28)    2.21 ( .59)

 1/2    5        .075 (.030)     .37 (.70)     .045 (.023)     .95 (.48)    1.23 (1.08)
       10        .086 (.022)     .64 (.47)     .047 (.014)     .88 (.31)    1.60 ( .78)
       20        .083 (.015)     .79 (.31)     .049 (.010)    1.02 (.22)    1.76 ( .56)
       40        .084 (.009)     .84 (.24)     .051 (.009)    1.02 (.13)    2.00 ( .36)

 1/4    5        .068 (.032)     .34 (.69)     .048 (.017)     .71 (.41)     .89 (1.20)
       10        .079 (.014)     .49 (.55)     .044 (.010)     .89 (.29)    1.32 ( .79)
       20        .083 (.013)     .63 (.41)     .046 (.005)     .96 (.16)    1.62 ( .57)
       40        .085 (.008)     .84 (.24)     .046 (.005)     .97 (.11)    1.70 ( .39)

 1/8    5        .057 (.036)     .27 (.75)     .042 (.012)     .62 (.46)     .64 (1.40)
       10        .075 (.019)     .41 (.61)     .047 (.008)     .80 (.27)    1.07 (1.01)
       20        .085 (.010)     .62 (.41)     .046 (.004)     .91 (.15)    1.44 ( .63)
       40        .083 (.006)     .75 (.28)     .047 (.004)     .96 (.09)    1.69 ( .41)
5.3 Regressions on Estimated Root Mean Squares

We describe the estimated root mean squares (RMS) of $\hat{E}(T_s)$, $\widehat{\mathrm{Var}}_T(T_s)$ and $\widehat{\mathrm{Var}}_L(T_s)$ as a linear function of $c$ and $n$. Our objective is to determine the response of RMS in relation to the changes in $\Delta t$ and $n_i$. For each estimator, the regression models were selected based on (1) the form of surface plots of RMS, $c$ and $n$, (2) the contribution of significant ($\alpha = .05$) linear, quadratic and crossproduct terms to the full model $r^2$ using SAS PROC RSREG (Allen, 1982), (3) (a) PROC REG for only linear terms, or (b) backward elimination regressions using PROC STEPWISE (inclusion of term in model at $\alpha = .05$) for models in (2) having quadratic and/or crossproduct terms, and (4) the same (possibly transformed) covariates $n$ and $c$ for each model for each estimator. Models with the following covariates were considered: $\{1/n, c\}$, $\{1/n, \sqrt{c}\}$, $\{1/n, c^2\}$, $\{1/\sqrt{n}, c\}$, $\{1/\sqrt{n}, \sqrt{c}\}$, $\{1/\sqrt{n}, c^2\}$, and $\{1/\sqrt{n}, 1/c\}$.

We determined that the models with covariates $1/n$ and $c$ satisfactorily met the criteria in (1) to (4) above (Table 10). Surface plots of RMS's vs. $c$ and $n$ for $\widehat{\mathrm{Var}}_T(T_s)$ and $\widehat{\mathrm{Var}}_L(T_s)$ indicated three groups of survival distributions based on differences in surface shapes. The groups of survival distributions were (1) beta, (2) exponential and gamma, and (3) normal and uniform. Groups were the same for both variance estimators. The differences in surface shapes (groups) are represented by the different regression models (Table 10). From the models, we conclude that, overall, the estimators will better estimate their respective parameter values as the number of $t_i$ in $[t_0, t_F]$ is increased and as the sample size, $n_i$, for each $t_i$ is increased.
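
As a check on the simplest of these models, the sketch below refits the no-intercept model RMS = a(1/n) + b c to the sixteen beta-distribution RMS values of $\hat{E}(T_s)$ listed in Table 5. It is written in Python, not SAS, and it solves the two normal equations directly. That is an assumption about how such a fit can be reproduced, not a description of the PROC REG run itself. The function and variable names are hypothetical.

```python
def fit_no_intercept(x1, x2, y):
    """Least-squares fit of y = a*x1 + b*x2 (no intercept) via the normal equations."""
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b


# RMS of the E(T_s) estimator under the beta distribution, read from Table 5.
# Keys are c; each list is ordered by n = 5, 10, 20, 40.
rms_beta = {
    1.0: [0.07, 0.05, 0.04, 0.02],
    0.5: [0.05, 0.04, 0.03, 0.02],
    0.25: [0.03, 0.02, 0.02, 0.01],
    0.125: [0.03, 0.02, 0.01, 0.01],
}
sizes = [5, 10, 20, 40]

inv_n, c_values, rms = [], [], []
for c, row in rms_beta.items():
    for n, value in zip(sizes, row):
        inv_n.append(1.0 / n)
        c_values.append(c)
        rms.append(value)

coef_inv_n, coef_c = fit_no_intercept(inv_n, c_values, rms)
print(round(coef_inv_n, 2), round(coef_c, 2))
```

Rounded to two decimals, the coefficients agree with the beta model for $\hat{E}(T_s)$ in Table 10. Both are positive, so the fitted RMS falls as $n$ grows and as $c$ shrinks.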


Table 10. Regression models for RMS. Model and $r^2$ for each combination of estimator and survival distribution, $G_s(t)$, are listed. All parameter values are different from zero at $\alpha = .05$.

Estimator     Survival        Model                                                r^2
              Distribution

E(T_s)        beta            .16(1/n) + .03c                                      .89
              exponential     .02 + 1.29(1/n) + .07c                               .97
              gamma           .04 + 1.58(1/n) + .10c                               .95
              normal          1.07(1/n) + .14c                                     .94
              uniform         .35(1/n) + .03c                                      .92

Var_T(T_s)    beta            .06(1/n) + .02c                                      .97
              exponential     .19 + 5.74(1/n) - .34c - 11.89(1/n^2)
                                + .36c^2 - 1.89(1/n)c                              .97
              gamma           .42 + 5.83(1/n) - .73c + 1.03c^2
                                + 4.49(1/n)c                                       .97
              normal          .08 + 2.02(1/n) - .32c + .62c^2                      .98
              uniform         .14(1/n) + .02c^2                                    .96

Var_L(T_s)    beta            .07(1/n) + .01c                                      .95
              exponential     .18 + 5.96(1/n) - .32c - 13.59(1/n^2)
                                + .32c^2 - 1.11(1/n)c                              .98
              gamma           .34 + 8.17(1/n) - .66c - 11.66(1/n^2)
                                + .77c^2 - 2.72(1/n)c                              .98
              normal          .03 + 2.14(1/n) + .25c^2                             .96
              uniform         .15(1/n) + .01c^2                                    .94
5.4 Conclusions Based on Simulations

$\hat{E}(T_s)$, $\widehat{\mathrm{Var}}_T(T_s)$ and $\widehat{\mathrm{Var}}_L(T_s)$ performed similarly in relation to the shapes of the survival distributions used in the simulations. Based on means and RMS of the estimators, these estimators best estimated their respective expected values for the concave beta and uniform survival distributions. These estimators performed worst under the gamma survival distribution. Because in applications the shape of the survival curve is rarely known, possibly graphing the relevant $p_{i,s}$ and observing the shape of the graph would aid the researcher in evaluating how 'good' the estimators may be in his/her particular experiment.
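
One way to look at the shape before trusting the estimators is sketched below in Python. It assumes, following the `_INSECT_` macro in Appendix A, that the relevant proportion at each sample time is the fraction of the sample still in a stage earlier than stage s. The counts, the 0-based stage indexing and the text "graph" are all hypothetical choices of this sketch, not data or output from the report.

```python
def not_yet_reached(counts, stage):
    """For each sample time, the proportion of the sample in stages before `stage`.

    counts[i][s] is the number of organisms observed in stage s at sample time i
    (stages are 0-based here, which is an indexing choice of this sketch).
    """
    return [sum(row[:stage]) / sum(row) for row in counts]


def text_plot(times, proportions, width=40):
    """Return a crude text graph with one bar per sample time."""
    lines = []
    for t, p in zip(times, proportions):
        lines.append(f"t={t:>5}  {p:4.2f}  " + "#" * round(p * width))
    return "\n".join(lines)


# Hypothetical counts for three stages at five sample times.
times = [0, 3, 6, 9, 12]
counts = [[10, 0, 0], [6, 4, 0], [2, 5, 3], [0, 3, 7], [0, 0, 10]]
curve_stage1 = not_yet_reached(counts, 1)
curve_stage2 = not_yet_reached(counts, 2)
print(text_plot(times, curve_stage2))
```

A curve that falls roughly along a straight line or a concave arc resembles the uniform and beta cases. A curve with a long right tail resembles the exponential and gamma cases, where the simulations showed more bias.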

$\hat{E}(T_s)$ was a good estimator of $E(T_s)$ for each combination of $\Delta t$ and $n_i$, $i = 0, \ldots, F$, for beta, uniform and normal survival distributions. Overall, the larger the number of $t_i$ in $[t_0, t_F]$ and the larger $n_i$, the closer, on average, $\hat{E}(T_s)$ will estimate $E(T_s)$. Even though $\hat{E}(T_s)$ is a biased estimator of $E(T_s)$ for the exponential and gamma survival distributions, the preceding recommendation holds because of the decrease in RMS's. Also, the decrease in $\widehat{\mathrm{Var}}(\hat{E}(T_s))$'s indicates less variability of $\hat{E}(T_s)$ as the number of $t_i$ in $[t_0, t_F]$ and $n_i$ are increased. However, in applications, a 'large' $n_i$ and a 'large' number of $t_i$ in $[t_0, t_F]$ are difficult to determine unless the researcher has similar information from a previous experiment.

$\widehat{\mathrm{Var}}_T(T_s)$ and $\widehat{\mathrm{Var}}_L(T_s)$ are reasonable estimators for $\mathrm{Var}(T_s)$ for beta, uniform and normal survival distributions, especially for large $n_i$ and a large number of $t_i$ in $[t_0, t_F]$, but rapidly underestimate $\mathrm{Var}(T_s)$ as the number of $t_i$ in $[t_0, t_F]$ increases, especially for smaller $n_i$. $\widehat{\mathrm{Var}}_L(T_s)$ would probably be preferred for a small number of $t_i$ in $[t_0, t_F]$ for concave or straight-line survival distributions (see graphing of $p_{i,s}$ above), and $\widehat{\mathrm{Var}}_T(T_s)$ would probably be preferred for survival distributions with similar shapes to the normal, exponential or gamma used in the simulations. For large $n_i$ and a large number of $t_i$ in $[t_0, t_F]$, either variance estimator could be used to estimate $\mathrm{Var}(T_s)$, since the two are nearly equal for these conditions as indicated by the simulations and Lemma 1. However, note again that both estimators tend to underestimate $\mathrm{Var}(T_s)$ under exponential and gamma survival distributions.

Regression models of RMS on covariates $1/n$ and $c$ for each estimator reinforce our previous conclusion that the estimators perform better overall as $\Delta t$ is decreased and as $n_i$ for each $t_i$ is increased.
6. CONCLUSIONS

We recommend the nonparametric estimators outlined in this report to estimate the respective parameter values under the experiment specified in Section 1. Increasing the number of sample times $t_i$ in $[t_0, t_F]$ and increasing the number of samples, $n_i$, for each $t_i$ will result in better estimates of $E(T_s)$ and $\mathrm{Var}(T_s)$. Better estimates of $p_{i,s}$ will be obtained by taking more $t_i$ when frequent stage transitions are occurring. For example, in Section 3, more $t_i$ between $t_3 = 9$ and $t_6 = 15$ would have resulted in better estimates of $p_{i,1}$ and $p_{i,2}$ and, hence, better estimates of $E(T_s)$ and $\mathrm{Var}(T_s)$.

Because the estimators appear to best estimate concave or straight-line survival distribution parameter values, graphing $p_{i,s}$ may aid the researcher in determining how well the estimators may be performing in his/her experiment. However, based on graphs of $p_{i,s}$ from simulations, the graphs of $p_{i,s}$ may show considerable variability across $t_i$ and yield no discernible shape of the survival distribution.

Because $\widehat{\mathrm{Var}}_T(T_s) > \widehat{\mathrm{Var}}_L(T_s)$, a conservative overall choice of a variance estimate would be $\widehat{\mathrm{Var}}_T(T_s)$. However, if a graph of $p_{i,s}$ has a similar shape to one of the survival distributions in Section 5, then the selection of a variance estimate could be based on the simulation results outlined in Section 5.4. If the $\Delta t$ are small and $n_i > 20$, then either variance estimate is appropriate. Note again that the experiment requires that the cohort of organisms all begin in stage 0 and complete development in stage A. However, slight deviations from these conditions probably do not significantly affect the estimates, especially for small $\Delta t$ and large $n_i$.

Some suggested extensions for further research are: (1) using the fact that the $p_{i,s}$ contain doubly censored information to possibly obtain better estimates of parameter values, (2) estimating $E(T_s)$ and $\mathrm{Var}(T_s)$ under right and/or left censoring of sample times, and (3) deriving estimators when failures (deaths) can be observed at each $t_i$.
APPENDIX A
COMPUTER PROGRAM SOURCE CODE


/*SERVICE      UNATTEND
/*REGION       500K
//  EXEC  SAS,OPTIONS=MACRO
//SYSIN  DD  *
OPTIONS  LS=85;

*@ PROGRAM DURPROG:                                                     @
*@ ----> LAST REVISED NOVEMBER 06, 1986 BY J.S. PONTIUS.                @
*@ CALCULATION OF EXPECTED VALUE, VARIANCE (EXPECTED VALUE) AND         @
*@ VARIANCE OF TIME TO REACH STAGE S AND DURATION FOR STAGE S FOR ONE   @
*@ SAS DATA SET THAT CONTAINS VALUES OF SAMPLE TIMES AND COUNTS (0, 1,  @
*@ 2, ... ) OF ITEMS IN A STAGE FOR EACH SAMPLE TIME.                   @
*@ PROGRAM CODED IN SAS MACRO LANGUAGE                                  @
*@         ( JCL REQUIRED FOR SAS MACRO LANGUAGE:                       @
*@                 //  EXEC  SAS,OPTIONS=MACRO                  ).      @
*@                                                                      @
*@ NOTE:   DURPROG CAN HANDLE A MAXIMUM OF 20 STAGES.                   @
*@         (NS = NUMBER OF STAGES).                                     @
*@ NOTE:   INPUT DATA SET MUST CONTAIN AT LEAST 3 SAMPLE TIMES.         @
*@                                                                      @
*@ DURPROG:                                                             @
*@    (1) READS IN A HARRIS $ADD DATA FILE (CAN SUBSTITUTE DATA CARDS   @
*@        IN PLACE OF $ADD STATEMENT),                                  @
*@    (2) CHECKS (A) THAT ALL COUNT DATA ARE NONNEGATIVE INTEGERS       @
*@        (B) THAT SAMPLE TIMES ARE >= 0 AND ARE A POSITIVE SEQUENCE.   @
*@    (3) %MACRO _INSECT_:                                              @
*@        CALCULATES EXPECTED VALUE, VARIANCE (EXPECTED VALUE) AND      @
*@        VARIANCE OF TIME TO REACH STAGE S AND DURATION FOR STAGE S.   @
*@    (4) %MACRO _LOOPER_: FORMATS OUTPUT FOR PRINTING.                 @
*@  INPUT DATA SET:                                                     @
*@        THE FORMAT FOR THE INPUT SAS DATA SET IS AS FOLLOWS           @
*@        (USER SPECIFIED INFORMATION IN [ ] ):                         @
*@                                                                      @
*@        DATA [VALID SAS DATASET NAME] ;                              *@
*@          LENGTH DATAID $21;                                         *@
*@          DATAID= '[STUDY IDENTIFIER OF 1 TO 21 CHARACTERS]';        *@
*@          INPUT (DAY COUNT1 - COUNT[NS]) (3. [NS]*2.);               *@
*@          CARDS;                                                     *@
*@          [DATA ENTERED IN COLUMN FORMAT]                             @
*@                                                                      @
*@         REQUIREMENTS: (A) DAY IS THE TIME WHEN SAMPLES WERE TAKEN,   @
*@                       (B) COUNT1 TO COUNT[NS] ARE NONNEGATIVE        @
*@                           INTEGER COUNTS OF ITEMS SAMPLED AT EACH    @
*@                           SAMPLE TIME.                               @
*@        EXAMPLE: DATA ONE;                                           *@
*@                 LENGTH DATAID $21;                                  *@
*@                 DATAID= 'STUDY ONE';                                *@
*@                 INPUT (DAY COUNT1 - COUNT3) (3. 3*2.);              *@
*@                 CARDS;                                              *@
*@                   315 0 0                                            @
*@                  10 9 8 0                                            @
*@                     .                                                @
*@                                                                      @
*@                     .                                                @
*@ REFERENCE:                                                           @
*@                                                                      @
*@                                                                      @


*****DEFINE GLOBAL MACRO VARIABLES: NS= NUMBER OF STAGES;
*                                   DDS= CURRENT DATA SET;
*                                   STUDYID= DATA SET IDENTIFIER;
*                                   CNT= ARRAY OF COUNTS;
%GLOBAL NS DDS STUDYID CNT;
*;
***********************************************************************
*  SUBROUTINE _FRMT_:                                                 *
*  CALLED BY MAIN PROGRAM.                                            *
*  FORMATS STAGES ACCORDING TO VALUE FORMATTING FOR PRINTING RESULTS. *
*  INPUTS:  F1 - F20 (STAGE IDENTIFIERS),                             *
*  OUTPUTS: F1 - F20 (FORMATTED STAGE IDENTIFIERS).                   *
***********************************************************************
*;

%MACRO _FRMT_(F1,F2,F3,F4,F5,F6,F7,F8,F9,F10,F11,F12,F13,F14,F15,F16,
              F17,F18,F19,F20);
PROC FORMAT;
   VALUE STAGEFMT  1=&F1
                   2=&F2
                   3=&F3
                   4=&F4
                   5=&F5
                   6=&F6
                   7=&F7
                   8=&F8
                   9=&F9
                  10=&F10
                  11=&F11
                  12=&F12
                  13=&F13
                  14=&F14
                  15=&F15
                  16=&F16
                  17=&F17
                  18=&F18
                  19=&F19
                  20=&F20;
%MEND _FRMT_;


*********************************************************************
*  SUBROUTINE _INSECT_:                                             *
*  CALLED BY _LOOPER_.                                              *
*  CALCULATION OF EXPECTED DURATION TIMES, VARIANCE OF EXPECTED     *
*  DURATION TIMES, VARIANCE OF DURATION TIMES AND OUTPUT MATRICES   *
*  FOR PRINTING RESULTS.                                            *
*  INPUTS:  DAY (UNIT OF TIME MEASUREMENT),                         *
*           &CNT (ARRAY OF STAGE COUNTS),                           *
*           INDATA (INPUT DATA SET).                                *
*  OUTPUTS: OUTDATA (OUTPUT DATA SET OF EXPECTED VALUES & STND DEV  *
*           OF EXPECTED VALUES).                                    *
*           OUTDEV (OUTPUT DATA SET OF STND DEV OF DURATION TIMES). *
*********************************************************************
*;

%MACRO _INSECT_(DAYVAR,X1,X2,X3,X4,X5,X6,X7,X8,X9,X10,X11,X12,X13,
                X14,X15,X16,X17,X18,X19,X20,INDATA=INDTA,OUTDATA=OUTDTA,
                OUTSTD=OUTDEV);
%*;
%*; 

PROC MATRIX;
%****************** DAY= VECTOR OF TIMES, COUNT= MATRIX OF STAGE COUNTS;
   FETCH DAY DATA=&INDATA (KEEP=&DAYVAR);
   FETCH COUNT DATA=&INDATA (KEEP=&X1-&&X&NS);
%*;
   N_STAGE=NCOL(COUNT);                    %*NUMBER OF STAGES;
   SAMSIZE=COUNT(,+);                      %*TOTAL SAMPLE SIZE PER SAMPLE TIME;
   PROB=COUNT#/(SAMSIZE @ J(1,N_STAGE));   %*MATRIX OF PROPORTIONS;
%*;
%********************IF INITIAL TIME > 0 THEN ADD TIME= 0 TO FOLLOWING;
%* MATRICES;
   IF DAY(1,) > 0 THEN DO;
      DAY=J(1,1,0)//DAY;
      WK=J(1,NCOL(PROB),0);
      WK(1,1)=1;
      PROB=WK//PROB;
      SAMSIZE=J(1,1)//SAMSIZE;
      COUNT=J(1,NCOL(COUNT),0)//COUNT;
      COUNT(1,1)=1;
   END;
   N_DAY=NROW(DAY);                        %*NUMBER OF SAMPLE TIMES;
%*;
%******************** CALCULATE DIFFERENCE OR SUM BETWEEN I & I+1 TIMES;
%* T(I+1) - T(I);
   TIMEINC=DAY(2:N_DAY,)-DAY(1:N_DAY-1,);
%* T(I+1)##2 - T(I)##2;
   TIMEINC2=(DAY(2:N_DAY,)##2) - (DAY(1:N_DAY-1,)##2);
%* T(I+1) + 2#T(I);
   SM2TERM1=DAY(2:N_DAY,) + (2#DAY(1:N_DAY-1,));
%* 2#T(I+1) + T(I);
   SM2TERM2=(2#DAY(2:N_DAY,)) + DAY(1:N_DAY-1,);
%********************* CALCULATE DIFFERENCES BETWEEN I-1 & I+1 TIMES;
   T2=TIMEINC(2:N_DAY-1,) + TIMEINC(1:N_DAY-2,);
%********************* SQUARE TIME DIFFERENCES FROM PREVIOUS LINE;
   T2=T2#T2;
%*;
%*;

%******************** LOOP THROUGH SUCCESSIVE STAGES;
   DO STAGE=2 TO N_STAGE;
      WK=PROB(,1:STAGE-1);
      P=WK(,+);      %*SUM PROPORTIONS IN STAGES 1,...,S - 1;
%******************** CALCULATE EXPECTED TIME, E(T(S)), TO STAGE 2,...,A;
      MU_K_HAT=0.5*(P(1:N_DAY-1,)#TIMEINC + P(2:N_DAY,)#TIMEINC);
      MU_K_HAT=MU_K_HAT(+,);
%******************** CALCULATE STANDARD ERROR, SQRT(VAR(E(T))), OF;
%* EXPECTED TIME TO STAGE 2,...,A;
      STDERR=P#(J(N_DAY,1)-P)#/SAMSIZE;  %*BINOMIAL VARIANCE FOR EACH TIME I,...;
      S=0.25*STDERR(2:N_DAY-1,)#T2;      %*VAR FOR TIMES 1 TO ...;
      S=S(+,);
      STDERR=SQRT(S(1,));
%*;
%********************* GET VECTOR OF PROPORTIONS OF STAGES TO;
%* CALCULATE EXPECTED DURATION IN STAGE S* - S,;
%* E(T(S*) - T(S)).;
      DURAT=PROB(,STAGE-1);
%*;
%********************* CALCULATE STANDARD ERROR(EXPECTED DURATION);
%* VAR[E(T(S*) - T(S))] FOR STAGE S* - S.;
%* NOTE: FOR STAGE= 2, STDERR= SDIFF.;
      SDIFF=DURAT#(J(N_DAY,1)-DURAT)#/SAMSIZE;  %*BINOMIAL VAR FOR EACH TIME.;
      S=0.25*SDIFF(2:N_DAY-1,)#T2;
      S=S(+,);
      SDIFF=SQRT(S(1,));
%********************* CALCULATE SECOND MOMENTS FOR VARIANCES OF TIMES;
%* TO REACH STAGE S, VAR(T(S)).;
      SECMOM1=(P(1:N_DAY-1,) + P(2:N_DAY,))#TIMEINC2;
      SECMOM1=SECMOM1(+,)#/2;
      SECMOM2=P(1:N_DAY-1,)#SM2TERM1 + P(2:N_DAY,)#SM2TERM2;
      SECMOM2=SECMOM2#TIMEINC;
      SECMOM2=SECMOM2(+,)#/3;
%*;
%********************* FORMAT RESULTS OF CALCULATIONS FOR PRINTING;
%* IF FIRST PASS THROUGH LOOP;
      IF STAGE=2 THEN DO;     %* (IE: STAGE=2);
         DURAT=MU_K_HAT;
         STDIFF=1;
         RDIFF=STDIFF || DURAT || SDIFF;
         RESULTS=STAGE || MU_K_HAT || STDERR;
      END;
%* IF STAGE > 2;
      ELSE DO;
%********************* CALCULATE EXPECTED DURATION FOR STAGE S* - S;
%* E(T(S*) - T(S)), AND FORMAT RESULTS.;
         DURAT=MU_K_HAT-RESULTS(STAGE-2,2);
         RDIFF=RDIFF // ((STAGE-1) || DURAT || SDIFF);
         RESULTS=RESULTS // (STAGE || MU_K_HAT || STDERR);
      END;
%********************CALCULATE STANDARD DEVIATIONS OF TIME TO REACH;
%* STAGES, SQRT[VAR(T(S))].;
      VAR1=SECMOM1 - (MU_K_HAT##2);
      STDEV1=SQRT(VAR1);
      VAR2=SECMOM2 - (MU_K_HAT##2);
      STDEV2=SQRT(VAR2);
%* FORMAT STANDARD DEVIATION MATRICES FOR OUTPUT;
      IF STAGE=2 THEN STORES=STAGE || STDEV1 || STDEV2;
      ELSE DO;
         TEMP=STAGE || STDEV1 || STDEV2;
         STORES=STORES // TEMP;
      END;
   END;
%************************* FORMAT MATRICES FOR OUTPUT;
   S=1;   M=0;   V=0;
   RESULTS=(S || M || V) // RESULTS;
   S=N_STAGE;
   RDIFF=RDIFF // (S || M || V);
   RESULTS=RESULTS || RDIFF(,2:3);
   OUTPUT RESULTS OUT=&OUTDATA (RENAME=(COL1=STAGE COL2=ESTIMATE
                                        COL3=STDERR COL4=DIFF
                                        COL5=STD_DIFF));
   OUTPUT STORES  OUT=&OUTSTD  (RENAME=(COL1=STAGE COL2=SD1
                                        COL3=SD2));
%MEND _INSECT_;


*********************************************************************** 

*  SUBROUTINE  _LOOPER_:  * 

*  CALLED  BY  MAIN  PROGRAM.  * 

*  PRINTS  INPUT  DATA  SET,  CALLS  _INSECT_  FOR  DATA  PROCESSING,  CALLS  * 

*  PROC  MEANS  FOR  CALCULATION  OF  USUAL  MEANS  ANO  STANDARD  DEVIATIONS  * 

*  OF  DURATION  TIMES,  ANO  FORMATS  RESULTS  FOR  OUTPUT  PRINTING.  * 

*  INPUTS:  DOS  (INPUT  DATA  SET),  * 

*  UNIT    (UNIT  OF  SAMPLE  TIME  MEASUREMENT) .  * 
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*      OUTPUTS:    NONE.  *; 

*; 

%MACRO  _LOOPER_(INDSET,UNIT) ; 
DATA  DATAl; 

SET  6.INDSET; 

CALL  SYMPUT( ' STUDYID' .DATAID) ; 
%******************  PRINT  OUTPUT  HEADER  AND  DATA  SET; 
PROC  PRINT; 

TITLEl  'ESTIMATION  OF  TIME  TO  AND  DURATION  OF  STAGE'; 

TITLE2  'FREQUENCY  DATA,  METHOD  OF  BOYER  AND  DEATON.'; 

TITLES  'PROGRAM  REVISED;  SEPTEMBER,  1986  BY  JS  PONTIUS.'; 

TITLE4  STUDY:  &STUDYID; 

TITLES  UNIT  OF  TIME  MEASUREMENT:  &UNIT; 

VAR  DAY  COUNTl-COUNT&NS; 

%_INSECT_(DAY,  &CNT , 

INDATA-%SCAN(&SYSDSN,2)  ,0UTDATA-DTA1,0UTSTD-DTA2)  ; 
%*****************  SET  EXPECTED,  VARIANCE  OF  AND  VARIANCE  OF  EXPECTED  ; 
%*  DURATION  TIMES  TO  MISSING  FOR  FIRST  AND  LAST  STAGES; 

DATA DTA1;
  SET DTA1;
  IF STAGE=1 THEN DO;
    ESTIMATE=.;
    STDERR=.;
  END;
  IF STAGE=&NS THEN DO;
    DIFF=.;
    STD_DIFF=.;
  END;
%*********************  PRINT  RESULTS  OF  EXPECTED  VALUES  &  VAR(EXPECTED; 
%*  VALUES).  ; 

PROC PRINT SPLIT='#' DATA= DTA1;
  ID STAGE;
  VAR ESTIMATE STDERR DIFF STD_DIFF;
  FORMAT STAGE STAGEFMT.;
  LABEL ESTIMATE='   TIME TO#REACH STAGE#(  E(T(S))  )'
        DIFF='    DURATION TIME#(  E(T(S))  -  T(S"))  )'
        STDERR='STD ERROR OF#   E(T(S))'
        STD_DIFF='   STD ERROR OF#E(T(S)  -  T(S"))';
%********************* PRINT RESULTS OF VAR(T(S));
PROC PRINT SPLIT='#' DATA= DTA2;
  ID STAGE;
  VAR SD1 SD2;
  FORMAT STAGE STAGEFMT.;
  LABEL SD1= 'STD DEVIATION OF T(S)#-TRAPEZOID ANALOG-'
        SD2= 'STD DEVIATION OF T(S)#  -STRAIGHT LINE-';
%MEND _LOOPER_;
*  SUBROUTINE  _DATACK_:  *

*  CALLED  BY  MAIN  PROGRAM.  * 

*  CHECKS  INPUT  DATA  SET  FOR  NEGATIVE  AND  IMPROPERLY  SEQUENCED     * 

*  SAMPLE  TIMES.  * 

*  CHECKS  INPUT  DATA  SET  FOR  ILLEGAL  NEGATIVE  AND  REAL  COUNT  DATA.  * 

*  INPUTS:  &DDS   (INPUT  DATA  SET).  * 

*  OUTPUTS:  NONE.  * 
*********************************************************************;
%MACRO _DATACK_;

DATA CHECK;
  SET &DDS;
%************ CHECK FOR SAMPLE TIME(J) <= SAMPLE TIME(J - 1);
PROC MATRIX;
  FETCH OBSDATA DATA= CHECK;
  STIME= OBSDATA(,1);
  NDAY= NROW(STIME);
  IF NDAY >= 2 THEN DO;
    OUTMAT= J(NDAY,2,0);
    DO K= 2 TO NDAY;
      IF ABS(STIME(K,)) <= ABS(STIME((K-1),)) THEN DO;
        OUTMAT(K,1)= K;
        OUTMAT(K,2)= STIME(K,);
      END;
    END;
  END;

  OUTPUT OUTMAT OUT= DAYERR (RENAME= (COL1= OBSNUMB COL2= DAY));
DATA DAYERR;
  SET DAYERR;
  IF OBSNUMB > 0 THEN DO;
    PUT '---->ERROR: SAMPLE TIME IS LESS THAN PREVIOUS SAMPLE TIME';
    PUT '  ' OBSNUMB=;
    PUT '  ' DAY=;
  END;
DATA CHECK;
  SET CHECK;
%*************** CHECK FOR SAMPLE TIME < 0.0;
  IF DAY < 0.0 THEN DO;
    PUT '---->ERROR: SAMPLE TIME < ZERO.';
    PUT '  ' _N_=;
    PUT '  ' DAY=;
  END;
%*; 

%DO I= 1 %TO &NS;
%*************** CHECK FOR NEGATIVE COUNT DATA;
  IF COUNT&I < 0.0 THEN DO;
    PUT '---->ERROR: COUNT DATA VALUE IS NEGATIVE.';
    PUT '  ' _N_=;
    PUT '  ' COUNT&I=;
  END;
%************** CHECK FOR REAL COUNT DATA;
  IF (COUNT&I - INT(COUNT&I)) > 0.0 THEN DO;
    PUT '---->ERROR: COUNT DATA VALUE IS NOT AN INTEGER.';
    PUT '  ' _N_=;
    PUT '  ' COUNT&I=;
  END;
%END;
PROC DELETE DATA= CHECK DAYERR;
%MEND _DATACK_;
*;


*********************************************************************** 

*  MAIN  PROGRAM:  * 

*  STATEMENTS  REQUIRED  FOR  PROGRAM  EXECUTION:  * 

*  (1)  HARRIS  $ADD  DATA  FILE  (OR  SUBSTITUTE  DATA  SET  AS  DESCRIBED  * 

*  IN  PROGRAM  HEADER) ,  * 

*  (2)  SPECIFY  THE  NUMBER  OF  STAGES  TO  BE  ANALYZED  (IE:  NS),  *

*  (3)  ASSIGN  THE  ARRAY,  CNT,  ELEMENTS  OF  VARIABLES  COUNT1  TO  *

*  COUNT[NS],  *

*  (4)  ENTER  LABELS  FOR  STAGES  IN  _FRMT_  SUBROUTINE  PARAMETER  LIST,* 

*  (5)  ENTER  SAMPLE  TIME  UNIT  (MEASUREMENT)  IN  2ND  SLOT  OF  * 

*  PARAMETER  LIST  IN  SUBROUTINE  _LOOPER_.  * 
*********************************************************************** 

*; 

*****ADD  DATA  CARDS  HERE:     *****************************************; 

*(1);

$ADD  LTESTA 

*****MAIN  PROGRAM  STATEMENTS  BEGIN  HERE:      **************************; 

              %LET DDS=%SCAN(&SYSDSN,2);
*(2);         %LET NS= 3;

              %PUT NOTE: DATA SETS CHECK & DAYERR ARE FOR ERROR ROUTINES.;

              %_DATACK_;

*(3);         %LET CNT=COUNT1,COUNT2,COUNT3;

              %PUT NOTE: DATA SET CHECK IS FOR ERROR ROUTINES.;

              %PUT NOTE: DATA SETS CHECK & DAYERR ARE BEING DELETED.;

*(4);         %_FRMT_(EGGLARVA,PUPA,ADULT);
*(5);         %_LOOPER_(&DDS,DAYS);
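
The input checks that _DATACK_ performs before any estimation can be
restated outside SAS. The following is a minimal sketch in Python, not
part of the original program: the function name
check_stage_frequency_data and the message wording are hypothetical,
and the data are assumed to arrive as a list of sample times and a list
of per-stage count rows. It mirrors the four tests in the listing:
sample times that do not increase (compared by absolute value, as in
the PROC MATRIX step), negative sample times, negative counts, and
non-integer counts.

```python
import math
from typing import List, Sequence


def check_stage_frequency_data(days: Sequence[float],
                               counts: Sequence[Sequence[float]]) -> List[str]:
    """Return one message per problem found; an empty list means the data pass.

    Hypothetical helper that mirrors the checks in the _DATACK_ macro.
    days[j] is the j-th sample time; counts[j][s] is the number of
    organisms observed in stage s+1 at that sample time.
    """
    errors: List[str] = []

    # Sample time must exceed the previous one; the listing compares
    # absolute values with <=, so ties are flagged as well.
    for k in range(1, len(days)):
        if abs(days[k]) <= abs(days[k - 1]):
            errors.append(
                f"obs {k + 1}: sample time {days[k]} does not exceed "
                f"previous sample time {days[k - 1]}"
            )

    for n, (day, row) in enumerate(zip(days, counts), start=1):
        # Sample times may not be negative.
        if day < 0.0:
            errors.append(f"obs {n}: sample time {day} is less than zero")
        for i, c in enumerate(row, start=1):
            # Counts may not be negative.
            if c < 0.0:
                errors.append(f"obs {n}: COUNT{i} = {c} is negative")
            # Counts must be whole numbers; math.trunc plays the role of
            # the SAS INT function (truncation toward zero).
            if c - math.trunc(c) > 0.0:
                errors.append(f"obs {n}: COUNT{i} = {c} is not an integer")

    return errors
```

Collecting the messages in a list, rather than writing them to the log
as the PUT statements do, is a design choice of this sketch; it lets
the caller decide whether to stop or continue when problems are found.
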
ABSTRACT:   We derive and evaluate, by simulation, estimators for the
following experiment. Consider an organism that displays observable
stages. Choose a sequence of fixed sample points in time. At each
sample point, observe a subset of a cohort of organisms and record the
number of organisms in each stage. Our objective is to estimate
parameters of the time in stage s, $T_s$, for an organism. We review
estimators of the time to stage s, $E(T_s)$, the mean duration time,
$E(T_s - T_{s'})$, and the variance of $E(T_s)$,
$\mathrm{Var}(E(T_s))$, proposed by Boyer and Deaton (1984). We derive
two variance estimators and prove two relational properties.
Simulation results under five survival distributions indicate that the
estimators provide reasonable estimates of the parameter values. The
estimates improve as the number of sample times within a finite
interval increases and as the sample size at each sample time
increases. The estimators are useful in studies of survival data, in
quality control, and in other studies in the life sciences and
engineering. We also describe a computer program to calculate the
estimates.


